Switch to faster hashing for deflate
I saw what was mentioned in #135 and #136 and decided to try out the performance benchmarks for myself.
Selected samples: (1 of 1)
> lorem_1mb
Sample: lorem_1mb.txt (1000205 bytes raw / ~257018 bytes compressed)
> deflate-fflate x 29.27 ops/sec ±1.97% (52 runs sampled)
> deflate-fflate-string x 25.85 ops/sec ±2.58% (47 runs sampled)
> deflate-imaya x 7.41 ops/sec ±2.72% (22 runs sampled)
> deflate-pako x 16.39 ops/sec ±1.44% (44 runs sampled)
> deflate-pako-string x 15.05 ops/sec ±0.50% (41 runs sampled)
> deflate-pako-untyped x 11.53 ops/sec ±1.06% (31 runs sampled)
> deflate-uzip x 29.20 ops/sec ±2.41% (51 runs sampled)
> deflate-zlib x 25.09 ops/sec ±0.41% (45 runs sampled)
> inflate-fflate x 252 ops/sec ±1.07% (87 runs sampled)
> inflate-fflate-string x 141 ops/sec ±0.75% (78 runs sampled)
> inflate-imaya x 159 ops/sec ±1.37% (83 runs sampled)
> inflate-pako x 193 ops/sec ±0.32% (87 runs sampled)
> inflate-pako-string x 59.54 ops/sec ±0.31% (61 runs sampled)
> inflate-pako-untyped x 65.32 ops/sec ±1.22% (64 runs sampled)
> inflate-uzip x 274 ops/sec ±0.73% (86 runs sampled)
> inflate-zlib x 589 ops/sec ±0.29% (93 runs sampled)
If you would like to experiment with the patches I made, I forked the repo: 101arrowz/pako. Note the addition of `fflate` and `uzip`. `uzip` was mentioned in the previous issues but is, as you mentioned, slightly unstable: it hangs on bad input rather than throwing an error, offers little configuration, cannot be used asynchronously or in parallel (multiple calls to `inflate` in separate callbacks, for example, cause it to fail), and, worst of all, does not support streaming.
I created `fflate` to resolve these issues while implementing some of the popular features requested here (e.g. ES6 modules). In addition, `fflate` adds asynchronous support via Web and Node workers (offloading work to a separate thread entirely rather than merely queuing it on the event loop) and ZIP support that is parallelized to easily beat JSZip. The bundle size also ends up much smaller than `pako`'s because of the much simpler code. As you can see, it is much faster than `pako`.
The purpose of this issue is not to boast about my library or to call `pako` a bad one. I believe `pako` can become much faster while remaining closer than `fflate` or `uzip` to the real `zlib` (i.e. the internal `lib/` directory). The overhauled implementation in Node 12, as mentioned in #193, is the perfect chance to escape the trap of trying to match the C source nearly line-for-line and instead become a canonical but high-performance rework in JavaScript. For this reason, I suggest trying to optimize some of the slower parts of the library. I found an even better hashing function that produced very few collisions in my testing, and for larger files/image files it outperforms even Node.js `zlib` in C; if you’d like to discuss, please let me know.
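The exact hashing function isn't shown here, so as a minimal sketch of the general idea: deflate implementations hash the next 3 bytes (the minimum match length) to index a chained table of earlier window positions, and a multiplicative hash tends to collide far less than zlib's shift-and-XOR update. The constant `2654435761` (Knuth's multiplicative constant) and the table sizes below are illustrative assumptions, not pako's or fflate's actual parameters:

```javascript
// Sketch of a multiplicative 3-byte hash for LZ77 match-finding.
// Constants and table sizes are illustrative, not any library's real ones.
const HASH_BITS = 15;
const HASH_SIZE = 1 << HASH_BITS;
const WSIZE = 1 << 15; // 32 KiB sliding window, as in deflate

// Hash the 3 bytes starting at position i (3 = deflate's minimum match).
function hash3(data, i) {
  const word = data[i] | (data[i + 1] << 8) | (data[i + 2] << 16);
  // Math.imul keeps the multiply in 32-bit integer range; the high bits
  // of a multiplicative hash are the best-mixed, so shift them down.
  return Math.imul(word, 2654435761) >>> (32 - HASH_BITS);
}

// zlib-style chained hash table: head[h] holds the most recent position
// with hash h; prev links each window slot to the previous same-hash one.
const head = new Int32Array(HASH_SIZE).fill(-1);
const prev = new Int32Array(WSIZE).fill(-1);

// Insert position i and return the previous position with the same hash
// (a candidate match), or -1 if none.
function insertHash(data, i) {
  const h = hash3(data, i);
  const candidate = head[h];
  prev[i & (WSIZE - 1)] = candidate;
  head[h] = i;
  return candidate;
}

// Repeated substrings hash to the same bucket and become match candidates:
const data = new TextEncoder().encode('abcdefabc');
for (let i = 0; i <= 5; i++) insertHash(data, i);
const match = insertHash(data, 6); // earlier occurrence of "abc"
```

The quality of this step matters because every inserted position with a colliding hash is a wasted byte-by-byte comparison in the match loop, so fewer collisions directly speed up `deflate`.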
Personally: thank you for your incredible contributions to the open source community. I use `pako` in multiple projects and greatly appreciate its stability, which is not present in libraries such as my own.
Issue Analytics
- Created 3 years ago
- Comments: 26 (15 by maintainers)
I found the reason why inflate speed decreases: the compression ratio is much worse (a bigger inflate input means slower inflation).
Increasing memLevel to 9 (for a 64k hash table) does not help.
I’ve started to experiment with the codebase and will report back if I make progress. The hashing is not centralized — it is set in many different places — so I need to dig deep into the code to tune its performance.
On another note, I saw you pushed some new commits for v2.0.0. If you’re going to be releasing a new version, I’d recommend you consider a new idea I came up with: auto-workerization.
Typical workerization techniques do not allow you to reuse existing synchronous code in an asynchronous fashion, but I developed a function that can take a function (along with all of its dependencies), generate an inline worker string, and cache it for higher performance. It’s been specifically optimized to work even after mangling, minification, and usage in any bundler. More importantly, you can reference other functions, classes, and static constants with this new method.
The main reason I suggest this is that very, very few people want the environment (which could very well be a user’s browser) to hang while a CPU-intensive task such as `deflate` runs on the main thread. Offering a method that offloads this to a worker automatically is incredibly useful for many less-experienced developers who don’t understand workers and just want high performance and good UX. You can see how I did it in `fflate` if you’d like to consider it.