Invalid gzip data (fflate gzipped the file) and corruption on decompression (Deno)
Issue Description
I am seeing a few issues with Gzip/Gunzip on Deno.
One is that compressing and then decompressing a modest-sized text file gives a corrupt and much smaller uncompressed version. Another is that compressing a modest-sized binary file generates an invalid compressed file, and attempting to decompress it throws an Invalid gzip data exception.
I wrote the following test to demonstrate the problem:
import * as fflate from 'https://cdn.skypack.dev/fflate';
async function zpipe(reader: Deno.Reader, stream: any) {
  let total = 0;
  async function push(p: Uint8Array, isLast?: boolean) {
    console.log('push', p.byteLength);
    debugger;
    await stream.push(p, isLast);
    total += p.byteLength;
  }
  let prevBlock;
  for await (const block of Deno.iter(reader)) {
    if (prevBlock) await push(prevBlock);
    prevBlock = block;
  }
  if (prevBlock) await push(prevBlock, true);
  console.log(`pushed ${total} bytes`);
}
function zip(from: string, to: string, options = {}) {
  let total = 0;
  return new Promise<void>(async (resolve, reject) => {
    const hFrom = await Deno.open(from, { read: true });
    const hTo = await Deno.open(to, { write: true, create: true, truncate: true });
    const zipper: any = new fflate.Gzip({ level: 9 }, async (chunk: Uint8Array, isLast: boolean) => {
      console.log('zip write chunk', chunk.byteLength);
      await hTo.write(chunk);
      total += chunk.byteLength;
      if (isLast) {
        console.log(`zip close dest file, ${total} bytes`);
        hTo.close();
        resolve();
      }
    });
    await zpipe(hFrom, zipper);
    console.log('zip close source file');
    hFrom.close();
  });
}
function unzip(from: string, to: string) {
  let total = 0;
  return new Promise<void>(async (resolve, reject) => {
    const hFrom = await Deno.open(from, { read: true });
    const hTo = await Deno.open(to, { write: true, create: true, truncate: true });
    const unzipper: any = new fflate.Gunzip();
    unzipper.ondata = async (chunk: Uint8Array, isLast: boolean) => {
      console.log('unzip write chunk', chunk.byteLength);
      await hTo.write(chunk);
      total += chunk.length;
      if (isLast) {
        console.log(`unzip close dest file, ${total} bytes`);
        hTo.close();
        resolve();
      }
    };
    await zpipe(hFrom, unzipper);
    console.log('unzip close source file');
    hFrom.close();
  });
}
const fn = Deno.args[0];
await zip(fn, `${fn}.gz`);
await unzip(`${fn}.gz`, `${fn}.unzipped`);
As a test, I downloaded fflate.js and compressed and decompressed that using the code above:
deno run --allow-all gzip.ts fflate.js
The resulting file sizes are:
-a---- 20/03/2021 15:43 54748 fflate.js
-a---- 21/03/2021 00:57 14322 fflate.js.gz
-a---- 21/03/2021 00:57 16384 fflate.js.unzipped
For the binary file test, I generated a 32 KB random binary file using dd:
dd if=/dev/random of=LARGE_FILE ibs=1k count=32
Then I compressed it with the above code:
deno run --allow-all gzip.ts LARGE_FILE
This throws an error on the unzip:
error: Uncaught (in promise) invalid gzip data
And the file command reports a strange size for the compressed file:
LARGE_DATA.gz: gzip compressed data, last modified: Sun Mar 21 01:01:11 2021, max compression, from Unix, original size modulo 2^32 100822718
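The "original size modulo 2^32" value that file prints comes from the ISIZE field, which is the last four bytes of a gzip member, stored little-endian (RFC 1952). As a rough diagnostic sketch, the helper below reads that field from a byte array so it can be compared against the real input length; for a 32 KiB input one would expect 32768. The name readGzipIsize is hypothetical (it is not part of fflate or Deno), and the code assumes the array holds a single, complete gzip member.

```typescript
// Hypothetical helper, not part of fflate or Deno.
// Assumes `gz` holds exactly one complete gzip member.
function readGzipIsize(gz: Uint8Array): number {
  // 10-byte header + 8-byte trailer is a safe lower bound for any gzip member.
  if (gz.length < 18) throw new Error("too short to be a gzip member");
  // Every gzip member starts with the magic bytes 0x1f 0x8b.
  if (gz[0] !== 0x1f || gz[1] !== 0x8b) throw new Error("missing gzip magic bytes");
  // ISIZE: uncompressed length modulo 2^32, little-endian, in the last 4 bytes.
  const view = new DataView(gz.buffer, gz.byteOffset + gz.byteLength - 4, 4);
  return view.getUint32(0, true);
}

// Fabricated 20-byte sample: valid magic bytes plus a trailer claiming 32768 bytes.
const sample = new Uint8Array(20);
sample[0] = 0x1f;
sample[1] = 0x8b;
sample.set([0x00, 0x80, 0x00, 0x00], 16);
console.log(readGzipIsize(sample)); // 32768
```

If the value read this way does not match the length of the data that was fed to the compressor, the stream received different bytes than intended.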
Issue Analytics
- State:
- Created 2 years ago
- Comments:5 (3 by maintainers)
I tried this and you are right: it failed to work properly on Deno. However, running a similar script on Node works just fine.
So I did a bit of digging and, as it turns out, Deno reuses the buffer each time you read a chunk! That means that prevBlock is exactly the same reference as block, meaning that you skip the first chunk entirely and duplicate the last chunk. So to fix it, you need to either copy the buffer after each iteration (yuck!) or use the new Uint8Array(0) trick (much faster). Hope that helps.
By the way, I noticed you were using fflate untyped, but you can easily add TypeScript support.
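The buffer-reuse problem described above, and both suggested fixes, can be reproduced without Deno or fflate. The sketch below is a simulation under stated assumptions: makeReusingReader is a hypothetical stand-in for a reader that hands out views over one shared buffer, and the sink simply records the bytes it receives, standing in for a stream's push(chunk, isLast) method that consumes each chunk synchronously. The empty-final-chunk variant assumes the stream accepts a zero-length last chunk, as the comment implies.

```typescript
// Simulation only: all names here are hypothetical, not Deno or fflate APIs.
type Sink = (chunk: Uint8Array, isLast: boolean) => void;
type Next = () => Uint8Array | null;
type Pipe = (next: Next, push: Sink) => void;

// Fills ONE shared buffer on every read and returns a view over it.
function makeReusingReader(data: Uint8Array, chunkSize: number): Next {
  const shared = new Uint8Array(chunkSize);
  let offset = 0;
  return () => {
    if (offset >= data.length) return null;
    const n = Math.min(chunkSize, data.length - offset);
    shared.set(data.subarray(offset, offset + n));
    offset += n;
    return shared.subarray(0, n);
  };
}

// Pattern from the issue: the held-back block is overwritten by the next read.
const pipeLookahead: Pipe = (next, push) => {
  let prev: Uint8Array | null = null;
  let block = next();
  while (block !== null) {
    if (prev !== null) push(prev, false);
    prev = block; // same memory as every later block
    block = next();
  }
  if (prev !== null) push(prev, true);
};

// Fix A: copy each block before holding on to it.
const pipeLookaheadCopy: Pipe = (next, push) => {
  let prev: Uint8Array | null = null;
  let block = next();
  while (block !== null) {
    if (prev !== null) push(prev, false);
    prev = new Uint8Array(block); // independent copy of the view's bytes
    block = next();
  }
  if (prev !== null) push(prev, true);
};

// Fix B: push every block immediately, then finish with an empty final chunk.
const pipeWithEmptyFinal: Pipe = (next, push) => {
  let block = next();
  while (block !== null) {
    push(block, false);
    block = next();
  }
  push(new Uint8Array(0), true);
};

// Runs a pipe over `data` and returns every byte the sink received, in order.
function collect(pipe: Pipe, data: Uint8Array, chunkSize: number): number[] {
  const out: number[] = [];
  pipe(makeReusingReader(data, chunkSize), (chunk) => {
    chunk.forEach((b) => out.push(b));
  });
  return out;
}

const input = new Uint8Array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]);
console.log(collect(pipeLookahead, input, 4).join(","));      // 4,5,6,7,8,9,6,7,8,9
console.log(collect(pipeLookaheadCopy, input, 4).join(","));  // 0,1,2,3,4,5,6,7,8,9
console.log(collect(pipeWithEmptyFinal, input, 4).join(",")); // 0,1,2,3,4,5,6,7,8,9
```

In the first run the opening chunk never reaches the sink and the tail is delivered twice, which matches the corruption described in the issue. Fix B avoids both the extra allocation and the lookahead bookkeeping, which is presumably why the comment calls it much faster.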