Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow parsing into an existing node #524

Merged
merged 3 commits into from
Jan 21, 2024
Merged

Conversation

Ericson2314
Copy link
Contributor

@Ericson2314 Ericson2314 commented Jan 18, 2024

Variation on #522 for same purpose. A good deal smaller. First commit is implementation, second commit is test.

@Ericson2314
Copy link
Contributor Author

@jgm @nwellnhof does this look better?

src/cmark.h Show resolved Hide resolved
src/cmark.h Outdated
Comment on lines 477 to 501
/** Creates a new parser object with the given node to use as the root
* node of the parsed AST.
*
* This is useful for creating a single document out of multiple parsed
* document fragments.
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading this, I wondered: what happens if the specified root node already has children? What happens then? Do the new nodes get added to the end of the children? Or do they replace existing children? Maybe the comment should be explicit about this.

Copy link
Contributor Author

@Ericson2314 Ericson2314 Jan 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They appear after, see add_child in blocks.c.

I do wish it and cmark_node_append_child shared code though to make that clearer, however.

Same could be said for make_block and cmark_node_new.

- The current page seems rendered at 72 chars, so match that.

- `---` and `--` in the convert dashes docs were being converted, so it
  looked like a no-op! (i.e. em dash becomes em dash, en dash becomes en
  dash.)
@Ericson2314
Copy link
Contributor Author

Ericson2314 commented Jan 21, 2024

I noticed that cmark.3 was stale. I fixed it a a new preparatory commit, but I think it would be nice to either not commit the generated file, or have a CI action that ensures it is up to date.

@jgm
Copy link
Member

jgm commented Jan 21, 2024

cmark.3 is updated as part of make all.
Since I use the Makefile for releases, it will always be updated before a release.

@jgm jgm merged commit 0d77ca1 into commonmark:master Jan 21, 2024
12 checks passed
@Ericson2314 Ericson2314 deleted the parse-into branch January 21, 2024 18:35
@nwellnhof
Copy link
Contributor

I would prefer an API like

int cmark_parser_set_target(cmark_parser *parser, cmark_node *target);
cmark_node *cmark_parser_get_target(cmark_parser *parser);

mainly because it makes it easier to support this feature in language bindings. For memory management, the reference from the parser object to the node object must be tracked and the get_target function provides easy access to this reference.

Besides, it would get more and more clumsy if we wanted to support additional settings in the parser constructor.

I'm aware that it's cleaner to avoid mutation of parser state and to set all options when constructing the parser object. Functions like cmark_parser_set_target should return an error if they're called after cmark_parser_feed.

@Ericson2314
Copy link
Contributor Author

Ericson2314 commented Jan 22, 2024

What if we had a struct cmark_parser_args where everything was nullable // optional? Flexible (don't deal with args you don't care about) and also no new stateful-ness for the parser itself?

@nwellnhof
Copy link
Contributor

nwellnhof commented Jan 22, 2024

Exposing structs in public APIs should be avoided because modifying the struct in future versions would break the ABI. There are some tricks like adding a version field to the struct but that's rather ugly.

The cleanest solution is to have a separate object for parser settings, leading to something like

cmark_parser_settings *settings = cmark_parser_settings_new();
cmark_parser_settings_set_options(settings, CMARK_OPT_SMART);
cmark_parser_settings_set_target(settings, target);
cmark_parser *parser = cmark_parser_new_with_settings(settings);
// ...
cmark_parser_free(parser);
cmark_parser_settings_free(settings);

For a real-world example look at pthread_mutex_init and pthread_mutexattr_t in pthreads. But making such a change to the libcmark API seems a bit intrusive to me.

@Ericson2314
Copy link
Contributor Author

Ericson2314 commented Jan 22, 2024

Ooo I like that. Making another opaque type to keep some state out of the parser feels good to me. Fewer edge cases, etc., and not too hard to to clients still.

Ericson2314 added a commit to NixOS/nix that referenced this pull request Jan 27, 2024
Needed until commonmark/cmark#524 is released.

• Added input 'cmark':
    'github:commonmark/cmark/cd37711b8a08da67ba4e21a42614b86dd8def929' (2024-01-26)
Ericson2314 added a commit to NixOS/nix that referenced this pull request Mar 25, 2024
Needed until commonmark/cmark#524 is released.

• Added input 'cmark':
    'github:commonmark/cmark/cd37711b8a08da67ba4e21a42614b86dd8def929' (2024-01-26)
Ericson2314 added a commit to NixOS/nix that referenced this pull request Oct 11, 2024
Needed until commonmark/cmark#524 is released.

• Added input 'cmark':
    'github:commonmark/cmark/cd37711b8a08da67ba4e21a42614b86dd8def929' (2024-01-26)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants