Support inspect on strings with unicode #79

dubzzz · 2024-05-07T16:54:12Z

Fixes #78

dubzzz · 2024-05-07T16:55:54Z

test/strings.js

+      expect(inspect('🐱🐱🐱', { truncate: 5 })).to.equal("'🐱…'")
+    })
+
+    it('truncates strings involving graphemes than truncate (5)', () => {


Supporting graphemes is far more complex than supporting surrogate pairs. Splitting a grapheme (a visual unit from the user's point of view) is acceptable as far as Unicode is concerned, because the string remains valid. Splitting a surrogate pair, on the other hand, produces a string that is invalid from a Unicode point of view, because it contains illegal characters (lone surrogates).
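To make the distinction concrete, here is a minimal sketch. The names `isValidUnicode`, `brokenPair` and `brokenGrapheme` are made up for this illustration and are not part of loupe. It relies on the fact that `encodeURI` throws a `URIError` when it meets a lone surrogate.

```javascript
// The cat emoji is one code point but two UTF-16 code units (a surrogate pair).
const cat = '🐱'

// Cutting between the two code units leaves a lone high surrogate.
const brokenPair = cat.slice(0, 1)

// Hypothetical helper: encodeURI throws a URIError on lone surrogates,
// so it can double as a validity check.
function isValidUnicode(text) {
  try {
    encodeURI(text)
    return true
  } catch {
    return false
  }
}

// A grapheme made of several code points: woman + zero-width joiner + laptop.
const technologist = '\u{1F469}\u200D\u{1F4BB}'

// Keeping only the first code point (two code units) changes what the user
// sees, but every surrogate that remains is still correctly paired.
const brokenGrapheme = technologist.slice(0, 2)

console.log(isValidUnicode(cat))
console.log(isValidUnicode(brokenPair))
console.log(isValidUnicode(brokenGrapheme))
```

Only the split surrogate pair fails the check. The split grapheme renders differently but is still a valid string.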

dubzzz · 2024-05-07T17:01:29Z

test/strings.js

@@ -59,6 +59,22 @@ describe('strings', () => {
      expect(inspect('foobarbaz', { truncate: 3 })).to.equal("'…'")
    })

+    it('truncates strings involving surrogate pairs longer than truncate (7)', () => {
+      expect(inspect('🐱🐱🐱', { truncate: 7 })).to.equal("'🐱🐱…'") // not '🐱🐱\ud83d…'


I'm the author of fast-check, a property-based testing library. This case is well suited to that kind of test, so I should at least let you know about it. The test would be something like:

it('truncates strings into valid unicode strings', () => {
  fc.assert(
    fc.property(fc.fullUnicodeString(), fc.nat(), (text, truncate) => {
      expect(() => encodeURI(inspect(text, { truncate }))).not.to.throw()
    })
  )
})

It would cover this specific case, and possibly others that might come up one day.

Sounds awesome! Is there a chai integration 😉?

No dedicated chai integration. At least nothing deeper than the suggestion I made 😅

I had a chat with the maintainers of Jest in the past about wrapping assert+property into some kind of expect(...).toSomething(...), but we never pushed the idea further. The only integrations I have mostly connect the it/test functions of the test runners with the library. There is nothing at the level of expect itself.

@keithamus let me know if the technique might be interesting for loupe/chaijs. If you feel it might be, I can propose a pull request with some tests using the technique (I can work on such a PR even if it never gets merged).

dubzzz · 2024-05-07T19:56:26Z

Note: Since loupe reimplements part of Node.js' util.inspect(), I had a look at how the built-in deals with such strings. It turns out it does not handle them properly: it truncates them in the middle of surrogate pairs and thus produces invalid strings.

43081j · 2024-05-07T20:00:24Z

> Note: Since loupe reimplements part of Node.js' util.inspect(), I had a look at how the built-in deals with such strings. It turns out it does not handle them properly: it truncates them in the middle of surrogate pairs and thus produces invalid strings.

this was going to be my main question. what does node do?

i suppose we need to decide if it makes sense to match node or to make it "visually sensible" (assuming node just splits it down the middle like loupe does)

keithamus

Nice work! I was going to comment to the effect of "let's try to just truncate left if we're in the middle of a surrogate pair" in #78 but I thought I'd hold back to see what you came up with. This looks like an elegant fix and I'm glad we're tackling it nicely. A+ from me.
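The "truncate left" idea can be sketched as follows. This is only an illustration under stated assumptions, not the code merged in this PR. The names `isHighSurrogate` and `truncateCodeUnits` are hypothetical, and `maxLength` is assumed to be a non-negative integer counting UTF-16 code units.

```javascript
// Hypothetical helper: true when the code unit is the first half of a surrogate pair.
function isHighSurrogate(codeUnit) {
  return codeUnit >= 0xd800 && codeUnit <= 0xdbff
}

// Hypothetical helper: keeps at most maxLength code units of text,
// without ever ending on the first half of a surrogate pair.
function truncateCodeUnits(text, maxLength) {
  if (text.length <= maxLength) return text
  let end = maxLength
  // If the last kept code unit is a high surrogate, its low half would be
  // cut off, so step one unit to the left and drop the whole pair.
  if (end > 0 && isHighSurrogate(text.charCodeAt(end - 1))) {
    end -= 1
  }
  return text.slice(0, end)
}

console.log(truncateCodeUnits('🐱🐱🐱', 3))
console.log(truncateCodeUnits('🐱🐱🐱', 4))
```

With a budget of 3 code units the sketch keeps a single emoji instead of one emoji plus a lone `\ud83d`. With a budget of 4 it keeps two whole emoji.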

dubzzz · 2024-05-08T09:51:46Z

Whoops, lint is not passing. I'll fix it.

src/helpers.ts

dubzzz · 2024-05-08T19:11:31Z

🟢 Lint fixed
🟢 Shorter and more performant version of surrogate check

Following the discussion started in chaijs#79, I'm opening this PR as an open question and a suggestion to better cover the library and its edge cases.

First, what is property-based testing? It's a technique coming from the functional world that aims to help developers detect edge cases without having to think of each of them up front.

Why property-based testing? To reduce the risk of bugs, and potentially of regressions, in key libraries.

Could it find problems? Probably. I tried another property, but I'm not sure whether its failure is acceptable or not. I don't know what to expect from truncate, so I was not able to make a decision.

```js
it('produces strings with at most <truncate> characters', () => {
  fc.assert(
    fc.property(
      fc.anything({
        withBigInt: true,
        withBoxedValues: true,
        withDate: true,
        withMap: true,
        withNullPrototype: true,
        withObjectString: true,
        withSet: true,
        withSparseArray: true,
        withTypedArray: true,
        withUnicodeString: true,
      }),
      fc.nat(),
      (source, truncate) => {
        const output = inspect(source, { truncate })
        expect(output).to.have.lengthOf.at.most(truncate)
      }
    )
  )
})
```

Who am I? I'm the author of fast-check, the library added by this PR. It's the leading property-based testing library today.

Support inspect on strings with unicode

279c1a8

dubzzz mentioned this pull request May 7, 2024

Producing unstable snapshots vitest-dev/vitest#5681

Closed

6 tasks

keithamus approved these changes May 7, 2024

dubzzz added 2 commits May 8, 2024 13:47

Update src/helpers.ts

8500a40

fix lint

20f94c3

keithamus merged commit e02467e into chaijs:main May 11, 2024
6 checks passed

dubzzz mentioned this pull request May 23, 2024

test: Add property-based tests #80

Open
