Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update remark-abbr to work with remark@next #516

Open
wants to merge 4 commits into
base: next
Choose a base branch
from

Conversation

richardTowers
Copy link

This is a rough and ready port of richardTowers/remark-abbr over the zmarkdown.

I've tried to stick as closely to the original tests as possible - most of them pass with only some minor changes to the fixtures. I've brought over my own tests (converted to Jest) for the micromark extension, but I haven't brought in my remark-abbr tests just yet.

I've added support for expandFirst to get as much of the test suite passing as possible.

I've dropped about half the type annotations I had in the original - I'm not sure if fully typing each plugin is a goal of zmarkdown, so it didn't seem worth it to fix all the comments. If it's not a goal, I'll remove the remaining type annotations.

The remark-abbr plugin is also making a half hearted attempt to keep track of positional information, but I'm fairly confident it's wrong so I should probably strip it out. I think the micromark syntax extension gets the positions mostly right, but transforming the tree and splitting out the text nodes makes the positions very confusing (particularly with expandFirst).

This is basically the same as the version in
https://github.com/richardTowers/remark-abbr, except I've converted the
tests from node:test to jest to be consistent with the rest of
zmarkdown.

This only exports a syntax extension - there's no HTML extension
(because micromark doesn't currently support the hooks we'd need to
implement one - see https://github.com/orgs/micromark/discussions/181
for more detail). It's only intended to be used in remark-abbr.
This implementation is pretty much taken verbatim from
https://github.com/richardTowers/remark-abbr. I didn't make much of an
attempt to merge it with the existing code.

The existing test suite is retained, and I haven't moved any of the
tests over from richardTowers/remark-abbr yet.

A couple of the snapshots needed to be updated, but in my view what the
code does now is more correct than what was happening before, so I think
they're okay.

I haven't implemented the expandFirst functionality just yet.
@StaloneLab
Copy link
Member

Hello, very sorry I have little time right now, but I saw your PR and will have a look!

@richardTowers
Copy link
Author

Hello! Absolutely no rush at all, it's not blocking anything for me. Take your time! (But also thanks for having a look)

Copy link
Member

@StaloneLab StaloneLab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Erh, I had wrote a comment but apparently these are not saved by Github. Overall okay, I left a few comments here and there, but have another point that disturbs me: currently, the micromark plugin does not work as-is. I usually try to do it, to allow people without AST (including me sometimes), ie. using micromark outside of remark, to get things done. I am however not sure if this is currently possible with micromark. Have you tried it and got no success at it?

I will do a second review later, after your answers and fixups, especially for remark-abbr, since I will need to run the code to understand a few points. Finally, to answer your questions:

  • we do not care about type annotations per-se, and use them nowhere in the repository, but if we can have them for free for this plugin, and it requires almost no work on our side, then I see no reason to drop them I guess;
  • about positional information, if you noticed them wrong, then please remove them, it is not worth the fight except if you really want it. If somebody wants it done someday, then he will make a PR. I do not want to bother checking them if you already know they are not right currently (but will if you fix everything, even though I say it is not needed). I would say these are perfect candidates for test-driven development (ie. write first a test describing the positions you expect for a few cases) if you want to work on it.

Thanks again for your work, this is looking like we may be able to ship quite easily.

*/
function abbrKeyDefinition (code) {
if (code === codes.leftSquareBracket) {
return factoryLabel.call(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good, I did not know such a plugin existed.

// Note: whitespace is optional.
const isSpace = markdownLineEndingOrSpace(code)
return isSpace
? factoryWhitespace(effects, abbrValueStart)(code)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means one space exactly (overall one or none) or multiple spaces are allowed? Our current plugin allows multiple, even though I am not saying this is a good behavior.

.write(preprocess()(input, null, true)),
)
const eventTypes = events.map((event) => [event[0], event[1].type])
expect(eventTypes).toEqual(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't aware of these kind of tests, am quite worried that they might be too micromark-dependant, and breaking on newer versions. Did you saw them used in another project?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the tests. Would like to see a few (you do not need to reproduce all of them) to see full Markdown to HTML conversion. This means also writing a small (usually ridiculously small) file like this one. Maybe in your case it is actually not so simple since you do not directly compile to HTML?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might not be very awake now, but are you able to explain changes to this file? I see about three different types of changes:

  • _ being replaced with *, I do not know why this happens;
  • noabbr is now present. I think this is a problem. Before, we used to not compile to Markdown abbrs that were not present in the text. Seems easy to keep this behavior, so I would like to, except if you disagree.
  • there are now line breaks between abbrs, I don't think we care.

.use(remarkAbbr, config)
.use(remark2rehype, {
handlers: {
abbrDefinition: () => undefined,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be handled in a way or another by the remark plugin. We cannot expect user to have this weird custom configuration every time. Or is it something test-related that I am missing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add dependency to micromark-extension-abbr.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refer to remark-iframes to see how to handle it on the monorepo.

}
})
.filter((match) =>
// We don't want to match "HTML" inside strings like "HHHHTMLLLLLL", so check that the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this actually existed before, since currently HTML works, but I have no idea how it works. Will have to check, because abbr appears in tree.

// surrounding characters are either undefined (i.e. start of string / end of string)
// or non-word characters
[match.prevChar, match.nextChar].every(
(c) => c === undefined || /^\W$/.test(c)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree with \W because it does not work outside of ASCII characters. For example try the plugin with abbr *[НПО]: неправительственная организация (russian for NGO) and input text НННПООО (same thing as what you did for HTML, I doubled the letters). The trailing Н (not latin H) and leading О (not latin O) are taken for non-word characters. But they are word characters, in another language. I suggest to use a proper plugin instead, I think we use something similar somewhere, search for unicode in project dependencies.

@StaloneLab
Copy link
Member

PS: do not bother about tests on Node 18 I would say, especially since they fail only there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants