Empty lines do not break options sections. #52

Open · wants to merge 3 commits into base: master
22 changes: 15 additions & 7 deletions docopt.cpp
@@ -161,16 +161,24 @@ std::vector<T*> flat_filter(Pattern& pattern) {
}

static std::vector<std::string> parse_section(std::string const& name, std::string const& source) {
// std::regex has no concept of multiline strings, so the symbols `^` and `$` match
// only once, at the start and at the end of a string, even if the string contains newline
// characters. For this reason, the following constructions are used instead:
// (?:^|\\n) - start of a line;
// (?=\\n|$) - end of a line.
//
// ECMAScript regex only has "?=" for a non-matching lookahead. In order to make sure we always have
// a newline to anchor our matching, we have to avoid matching the final newline of each grouping.
// Therefore, our regex is adjusted from the docopt Python one to use ?= to match the newlines before
// the following lines, rather than after.
//
// In Boost.Regex, the wildcard `.` matches any single character, including the newline character,
// so the `[^\\n]` construction is used instead.
std::regex const re_section_pattern {
"(?:^|\\n)" // anchored at a linebreak (or start of string)
"("
"[^\\n]*" + name + "[^\\n]*(?=\\n?)" // a line that contains the name
"(?:\\n[ \\t].*?(?=\\n|$))*" // followed by any number of lines that are indented
")",
"(?:^|\\n)(" // A section begins at start of a line and consists of:
"[^\\n]*" + name + "[^\\n]*" // - a line that contains the section's name; and
"(?:" // - several
"\\n+[ \\t][^\\n]*" // indented lines possibly separated by empty lines.
Review comment from the contributor (PR author):

I believe there is a better solution for this issue. But that is how the code looks now, so I decided it is okay to reuse this approach. After all, this issue should be addressed in a separate pull request rather than in this one.

")*"
")(?=\\n|$)", // The section ends at the end of a line.
std::regex::icase
};
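
The line-anchoring idea described in the comments above can be tried in isolation. The following is a minimal sketch, not code from this pull request: the helper name `lines_containing` is hypothetical, and `word` is inserted into the pattern unescaped, in the same way `name` is in `parse_section`.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not part of docopt.cpp): returns every line of `text`
// that contains `word`, using the same line anchors as the diff above.
std::vector<std::string> lines_containing(std::string const& word, std::string const& text) {
    // (?:^|\n) stands in for "start of a line" and (?=\n|$) for "end of a line".
    // [^\n]* keeps each match on a single line.
    std::regex const re("(?:^|\\n)([^\\n]*" + word + "[^\\n]*)(?=\\n|$)");

    std::vector<std::string> lines;
    for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it) {
        lines.push_back((*it)[1].str());  // capture group 1 is the line without its leading newline
    }
    return lines;
}
```

Because the end of a line is checked with a lookahead, the trailing newline is not consumed and stays available as the `\n` that anchors the next match.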

18 changes: 18 additions & 0 deletions testcases.docopt
@@ -955,3 +955,21 @@ other options:
"""
$ prog --baz --egg
{"--foo": false, "--baz": true, "--bar": false, "--egg": true, "--spam": false}


# An empty line must not break an options section.
r"""
Usage: prog [options]

Options:
  --before-empty-lines  An option before empty lines.


  --after-empty-lines   An option after empty lines.
"""

$ prog --before-empty-lines
{"--before-empty-lines": true, "--after-empty-lines": false}

$ prog --after-empty-lines
{"--before-empty-lines": false, "--after-empty-lines": true}
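
To see the effect of the change on this test case, the two patterns from the docopt.cpp diff can be run side by side. This is a rough sketch under some assumptions: `old_section_pattern`, `new_section_pattern`, and `find_sections` are hypothetical names introduced here, the "old" pattern is taken to be the one with the `.*?` construct in the diff, and the docstring is assumed to indent each option line and separate the option from its description with two spaces.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Pattern that appears to be the previous one in the diff: indented lines
// must follow each other directly, separated by exactly one newline.
std::string old_section_pattern(std::string const& name) {
    return "(?:^|\\n)"
           "("
           "[^\\n]*" + name + "[^\\n]*(?=\\n?)"
           "(?:\\n[ \\t].*?(?=\\n|$))*"
           ")";
}

// Pattern proposed in the diff: `\n+` lets empty lines sit between indented lines.
std::string new_section_pattern(std::string const& name) {
    return "(?:^|\\n)("
           "[^\\n]*" + name + "[^\\n]*"
           "(?:"
           "\\n+[ \\t][^\\n]*"
           ")*"
           ")(?=\\n|$)";
}

// Collects capture group 1 of every match, case-insensitively.
std::vector<std::string> find_sections(std::string const& pattern, std::string const& source) {
    std::regex const re(pattern, std::regex::icase);
    std::vector<std::string> sections;
    for (std::sregex_iterator it(source.begin(), source.end(), re), end; it != end; ++it) {
        sections.push_back((*it)[1].str());
    }
    return sections;
}
```

Calling `find_sections(old_section_pattern("options:"), doc)` and `find_sections(new_section_pattern("options:"), doc)` on the docstring above should each yield one section. With the old pattern the section stops at the first empty line, while with the new one it also covers `--after-empty-lines`.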