Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve section detection #327

Open
wants to merge 17 commits into
base: main
Choose a base branch
from
Open

Conversation

keewis
Copy link
Contributor

@keewis keewis commented May 24, 2021

Potentially fixes #316.

This tries to allow parsing sections which are not separated by blank lines (there should probably be a warning in that case, I'll add that once the general idea has been approved). In order to get that to work used a few tricks (e.g. add a optional doc parameter to _is_at_section to allow calling it on a different reader) so it might need some refactoring before being truly ready.

cc @Carreau, my main motivation was trying to get velin to auto-fix this

@keewis
Copy link
Contributor Author

keewis commented May 24, 2021

yielding StopIteration seems like a bug:

if name.startswith('..'): # index section
yield name, data[1:]
elif len(data) < 2:
yield StopIteration
else:
yield name, self._strip(data[2:])

because it will cause this to fail with a obscure error:

sections = list(self._read_sections())
section_names = set([section for section, content in sections])

I think this should be fixed, either by changing the yield value to yield StopIteration, StopIteration or yield "", [], or by raising an error instead (if that's what that was supposed to indicate).

@keewis
Copy link
Contributor Author

keewis commented Jan 16, 2022

Ping. Does anyone have any comments on this?

@rgommers
Copy link
Member

Ping. Does anyone have any comments on this?

I think the discussion in gh-316 shows this is not desirable. Detecting in order to raise a better warning or even an exception makes sense though.

@keewis
Copy link
Contributor Author

keewis commented Jan 3, 2023

apologies for pinging and then forgetting about it for a year.

I implemented the requested change, such that it will now warn on every missing empty line (between summary and the first section, or between two sections).

The current implementation comes with a (slight?) performance regression because NumpyDocString._is_at_section is now called for every line, and I had to do a slightly ugly trick to make sure _is_at_section doesn't swallow empty lines.

Instead, I could also imagine doing a two-pass implementation: find all separators (multiple - or = per line) and check if those belong to a section which is not preceded by an empty line in the first pass, then run the actual extraction in the second pass. That might make the code a bit simpler and potentially faster, but every line would be visited twice (I'm not sure how much of an issue that would be).

Edit: I think the CI failures are unrelated (not sure, though)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support (detect?) missing blank line before Returns sections.
2 participants