Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explore xxh algo and benchmark #588

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

cristianbote
Copy link
Owner

@cristianbote cristianbote commented Aug 26, 2024

This is related and in response of emotion-js/emotion#3241

cc @romgrk

The weird part -- but kinda expected -- was that on node 18.17.0 goober is constantly second but then drops with newer versions.

~/work/github/js-benchmark (master*) $ node ./index.js ./benchmarks/hashing.js 
### node: v18.17.0 ###
CASE: murmur2_unmodified: #########################.........................   50.17% (301 ops) (3.326 ms/op)
CASE: murmur2:            ######################################............   76.17% (457 ops) (2.190 ms/op)
CASE: xxh:                ##################################################  100.00% (600 ops) (1.667 ms/op)
CASE: fnv-1a:             #####################################.............   74.83% (449 ops) (2.227 ms/op)
CASE: djb2a:              ##########################################........   85.67% (514 ops) (1.946 ms/op)
CASE: goober_unmodified:  ############################################......   89.67% (538 ops) (1.861 ms/op)

~/work/github/js-benchmark (master*) $ nvm use v18.18.0 
Now using node v18.18.0 (npm v9.8.1)

~/work/github/js-benchmark (master*) $ node ./index.js ./benchmarks/hashing.js 
### node: v18.18.0 ###
CASE: murmur2_unmodified: #######################...........................   46.71% (298 ops) (3.356 ms/op)
CASE: murmur2:            ######################################............   77.74% (496 ops) (2.016 ms/op)
CASE: xxh:                ##################################################  100.00% (638 ops) (1.567 ms/op)
CASE: fnv-1a:             ###################################...............   71.16% (454 ops) (2.205 ms/op)
CASE: djb2a:              ########################################..........   81.19% (518 ops) (1.932 ms/op)
CASE: goober_unmodified:  ######################################............   77.43% (494 ops) (2.024 ms/op)

Besides that it really is peculiar what is happening on bun that runs so slow?!

Anyway, I've copy/pasted the xxh on goober's hash perf test, and these are the results:

node v22.7.0
~/work/github/goober (explore-xxh-algo-and-benchmark*) $ npm run test-perf-hash

> [email protected] test-perf-hash
> NODE_ENV=production node benchmarks/perf-hash.cjs


Starting HASH!
▸ goober original HASH x 2,830,730 ops/sec ±0.81% (94 runs sampled)
▸ goober optimized HASH x 12,688,564 ops/sec ±3.04% (88 runs sampled)
▸ goober optimized HASH with ASM hints x 13,045,723 ops/sec ±2.04% (92 runs sampled)
▸ twind HASH x 2,617,884 ops/sec ±2.34% (93 runs sampled)
▸ xxh x 6,105,582 ops/sec ±0.49% (93 runs sampled)

Fastest is: goober optimized HASH with ASM hints

and

node v18.17.0
~/work/github/goober (explore-xxh-algo-and-benchmark*) $ npm run test-perf-hash

> [email protected] test-perf-hash
> NODE_ENV=production node benchmarks/perf-hash.cjs


Starting HASH!
▸ goober original HASH x 2,735,026 ops/sec ±3.08% (89 runs sampled)
▸ goober optimized HASH x 13,061,639 ops/sec ±0.89% (90 runs sampled)
▸ goober optimized HASH with ASM hints x 12,969,513 ops/sec ±1.12% (92 runs sampled)
▸ twind HASH x 2,653,173 ops/sec ±0.68% (93 runs sampled)
▸ xxh x 6,017,804 ops/sec ±0.73% (90 runs sampled)

Fastest is: goober optimized HASH

Looking forward to your thoughts

Copy link

vercel bot commented Aug 26, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
goober-rocks ✅ Ready (Inspect) Visit Preview 💬 Add feedback Aug 26, 2024 6:56am

Copy link

This pull request is automatically built and testable in CodeSandbox.

To see build info of the built libraries, click here or the icon next to each commit SHA.

Copy link

Size Change: 0 B

Total Size: 3.84 kB

ℹ️ View Unchanged
Filename Size Change
dist/goober.esm.js 1.26 kB 0 B
dist/goober.modern.js 1.26 kB 0 B
dist/goober.umd.js 1.32 kB 0 B

compressed-size-action

@romgrk
Copy link

romgrk commented Aug 26, 2024

I'm exploring the disassembly of xxh on V8, it looks like it's doing a lot of validation and doesn't have confidence yet that it's indeed a Smi value despite all the existing asm.js annotations, so I went overboard and added them everywhere
image

It does seem to help all engines:
image

xxh and xxh_unmodified in this commit, I'll just keep the optimized version to continue the exploration.


I've also established in emotion-js/emotion#3241 (comment) that the final string formatting function can have a big impact, so goober doing a simple 'go' + number might be more efficient than xxh doing .toString(36).


I need to explore more before giving a better answer but I'll include the updated goober version in my benchmarks, though the collision rate makes it a bad candidate for emotion. I'll try to figure if I can find a better algo for your requirements (super low byte size).

@romgrk
Copy link

romgrk commented Aug 26, 2024

Note that you don't return anything from the goober hash functions in your tests, that might also impact the result (engines could ellicit an unused constant value with no side-effect). Also your benchmark has a single constant string as dataset instead of a proper parameter, that's another factor that could be used for optimization by the engines. It would be better to build a representative datasets of strings you see in production, and run the perf tests on that dataset.

@romgrk
Copy link

romgrk commented Aug 26, 2024

For the record, here is a confirmation that 'go' + number (decimal formatting) is very efficient on v8. No significant difference in other engines:

image

@cgiosy
Copy link

cgiosy commented Nov 5, 2024

Hello. I slightly modified the code to make xxh32 suitable for benchmarking, and obtained the following results:

# Node (v22.6.0)

Starting HASH!
▸ twind HASH x 3,099,745 ops/sec ±0.40% (100 runs sampled)
▸ goober original HASH x 5,780,316 ops/sec ±0.44% (95 runs sampled)
▸ goober optimized HASH x 14,695,271 ops/sec ±0.67% (95 runs sampled)
▸ goober optimized HASH with ASM hints x 14,728,415 ops/sec ±0.54% (95 runs sampled)
▸ xxh32 x 17,213,817 ops/sec ±0.31% (98 runs sampled)

Fastest is: xxh32


# Bun (1.1.34)

Starting HASH!
▸ twind HASH x 8,288,343 ops/sec ±0.74% (94 runs sampled)
▸ goober original HASH x 1,672,256 ops/sec ±0.17% (98 runs sampled)
▸ goober optimized HASH x 2,075,684 ops/sec ±0.96% (93 runs sampled)
▸ goober optimized HASH with ASM hints x 10,446,495 ops/sec ±0.90% (93 runs sampled)
▸ xxh32 x 25,015,040 ops/sec ±1.87% (86 runs sampled)

Fastest is: xxh32

Actually, considering the bundle size of goober, using xxh32 is overkill. I just wrote this because there was talk about my library. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants