Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't append a newline character when rendering inline nodes. #559

Merged
merged 1 commit into from
Sep 7, 2024

Conversation

ioquatix
Copy link
Contributor

See #558 for context.

@ioquatix ioquatix force-pushed the cmark-no-inline-newline branch 3 times, most recently from 9be6ac0 to 9b5d8d1 Compare August 19, 2024 04:59
@ioquatix ioquatix changed the title No inline newline Don't append a newline character when rendering inline nodes. Aug 19, 2024
@ioquatix
Copy link
Contributor Author

While working on this, I wondered if we could upstream some of the cmark-gfm improvements, e.g.

#define CMARK_NODE_TYPE_PRESENT (0x8000)
#define CMARK_NODE_TYPE_BLOCK (CMARK_NODE_TYPE_PRESENT | 0x0000)
#define CMARK_NODE_TYPE_INLINE (CMARK_NODE_TYPE_PRESENT | 0x4000)
#define CMARK_NODE_TYPE_MASK (0xc000)
#define CMARK_NODE_VALUE_MASK (0x3fff)

typedef enum {
  /* Error status */
  CMARK_NODE_NONE = 0x0000,

  /* Block */
  CMARK_NODE_DOCUMENT       = CMARK_NODE_TYPE_BLOCK | 0x0001,
  CMARK_NODE_BLOCK_QUOTE    = CMARK_NODE_TYPE_BLOCK | 0x0002,
  CMARK_NODE_LIST           = CMARK_NODE_TYPE_BLOCK | 0x0003,
  CMARK_NODE_ITEM           = CMARK_NODE_TYPE_BLOCK | 0x0004,
  CMARK_NODE_CODE_BLOCK     = CMARK_NODE_TYPE_BLOCK | 0x0005,
  CMARK_NODE_HTML_BLOCK     = CMARK_NODE_TYPE_BLOCK | 0x0006,
  CMARK_NODE_CUSTOM_BLOCK   = CMARK_NODE_TYPE_BLOCK | 0x0007,
  CMARK_NODE_PARAGRAPH      = CMARK_NODE_TYPE_BLOCK | 0x0008,
  CMARK_NODE_HEADING        = CMARK_NODE_TYPE_BLOCK | 0x0009,
  CMARK_NODE_THEMATIC_BREAK = CMARK_NODE_TYPE_BLOCK | 0x000a,
  CMARK_NODE_FOOTNOTE_DEFINITION = CMARK_NODE_TYPE_BLOCK | 0x000b,

  /* Inline */
  CMARK_NODE_TEXT          = CMARK_NODE_TYPE_INLINE | 0x0001,
  CMARK_NODE_SOFTBREAK     = CMARK_NODE_TYPE_INLINE | 0x0002,
  CMARK_NODE_LINEBREAK     = CMARK_NODE_TYPE_INLINE | 0x0003,
  CMARK_NODE_CODE          = CMARK_NODE_TYPE_INLINE | 0x0004,
  CMARK_NODE_HTML_INLINE   = CMARK_NODE_TYPE_INLINE | 0x0005,
  CMARK_NODE_CUSTOM_INLINE = CMARK_NODE_TYPE_INLINE | 0x0006,
  CMARK_NODE_EMPH          = CMARK_NODE_TYPE_INLINE | 0x0007,
  CMARK_NODE_STRONG        = CMARK_NODE_TYPE_INLINE | 0x0008,
  CMARK_NODE_LINK          = CMARK_NODE_TYPE_INLINE | 0x0009,
  CMARK_NODE_IMAGE         = CMARK_NODE_TYPE_INLINE | 0x000a,
  CMARK_NODE_FOOTNOTE_REFERENCE = CMARK_NODE_TYPE_INLINE | 0x000b,
} cmark_node_type;

was a nice way to organize the enum.

src/node.h Outdated
@@ -86,6 +86,22 @@ struct cmark_node {

CMARK_EXPORT int cmark_node_check(cmark_node *node, FILE *out);

static inline bool CMARK_NODE_TYPE_BLOCK_P(cmark_node_type node_type) {
return node_type < CMARK_NODE_TEXT;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that cmark.h defines CMARK_NODE_LAST_BLOCK and CMARK_NODE_FIRST_BLOCK, as well as CMARK_NODE_FIRST_INLINE and CMARK_NODE_LAST_INLINE. So an easy way to check for blocks is

node->type >= CMARK_NODE_FIRST_BLOCK && node->type <= CMARK_NODE_LAST_BLOCK

etc. This is easy enough that I don't think it's really necessary to create these functions.

Copy link
Contributor Author

@ioquatix ioquatix Aug 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to establish some level of consistency with the interface introduced by cmark-gfm. Also I prefer to express intent, e.g. "is this a block" is the intent, not "is this great than some arbitrary thing and less than some other arbitrary thing".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it may be a good idea in the end. But I'd prefer to keep that issue separate from this change.

@jgm
Copy link
Member

jgm commented Aug 29, 2024

Looks good. (I made one comment.)

As for the other suggestion, it should probably be considered in a separate issue or PR.
It does seem a good idea.

@jgm
Copy link
Member

jgm commented Aug 29, 2024

Also, note the linter error in CI. To keep things simple, let's dispense with defining these extra functions and just use the method noted above in my comment.

@ioquatix
Copy link
Contributor Author

Okay, I will update the PR.

src/render.c Outdated
Comment on lines 151 to 154
static inline int cmark_is_block(cmark_node *node) {
return node->type >= CMARK_NODE_FIRST_BLOCK && node->type <= CMARK_NODE_LAST_BLOCK;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the naming conventions we've been using are that cmark_* is used for the functions in the public API. So I think it may introduce confusion to name this function cmark_is_block. Maybe use is_block or S_is_block (a conventioned that can be found in e.g. blocks.c.))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, this already exists in node.c:

static inline bool S_is_block(cmark_node *node) {
  if (node == NULL) {
    return false;
  }
  return node->type >= CMARK_NODE_FIRST_BLOCK &&
         node->type <= CMARK_NODE_LAST_BLOCK;
}

static inline bool S_is_inline(cmark_node *node) {
  if (node == NULL) {
    return false;
  }
  return node->type >= CMARK_NODE_FIRST_INLINE &&
         node->type <= CMARK_NODE_LAST_INLINE;
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a good reason to extract this into some decent public interface.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'd be in favor of (a) moving these to cmark.h/c, and making them public.
Maybe also (b) re-organizing the enums the way cmark-gfm does, to make the tests more efficient.

@nwellnhof can you think of any reason not to do these things?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'd be in favor of (a) moving these to cmark.h/c, and making them public.

+1

Maybe also (b) #559 (comment), to make the tests more efficient.

The test is efficient enough. I think this is more about extensibility. Technically, changing the enum values is an ABI break. This shouldn't be a problem as long as we continue to bump the soname with each release and people don't hardcode the constants. But I don't see a good reason to make the change as long as we don't add new node types. If cmark_is_block is made public, we should think about deprecating CMARK_NODE_FIRST_BLOCK and CMARK_NODE_LAST_BLOCK.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another function we should consider making public is S_is_leaf from src/iterator.c.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, let's not change the enums at this point. But let's add cmark_is_block, cmark_is_inline, and cmark_is_leaf to the public API in cmark.h.

@ioquatix do you want to do this as part of this PR, or should we do that separately and have you rebase?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to do it as a separate PR, and I'll rebase this one.

@jgm jgm merged commit fe807df into commonmark:master Sep 7, 2024
15 checks passed
@jgm
Copy link
Member

jgm commented Sep 7, 2024

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants