Support reading multi-member gzip files or providing access to remaining data
Issue Description
What can’t you do right now?
Gzip supports ‘multi-member’ files, in which several gzip members are concatenated one after the other. This is used in certain formats, such as WARC.
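For reference, each member in such a file starts with its own gzip header. As a rough sketch based on RFC 1952 (the helper name isGzipMemberStart is made up for illustration and is not part of fflate or pako), a candidate member offset can be sanity-checked like this:

```typescript
// Hypothetical helper (not an fflate or pako API): checks whether `data`
// has a plausible gzip member header at `offset`, per RFC 1952.
function isGzipMemberStart(data: Uint8Array, offset: number): boolean {
  // A member header is at least 10 bytes long.
  if (offset < 0 || offset + 10 > data.length) return false;
  return (
    data[offset] === 0x1f &&     // ID1
    data[offset + 1] === 0x8b && // ID2
    data[offset + 2] === 8       // CM = 8 (deflate)
  );
}
```

This only validates an offset that is already known. The same three bytes can also occur inside compressed data, so scanning for them is not a reliable way to find member boundaries.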
An optimal solution
An optimal solution would be for fflate Inflate to provide a way to parse multi-member gzip files through an option, plus an additional callback that fires when a new member starts (and reports the offset of that member in the stream).
Another option is to provide an offset into the buffer consumed by reading the gzip, allowing the developer to manually create a new Inflate object.
(How) is this done by other libraries?
pako provides an avail_in counter, which keeps track of how many bytes have not yet been consumed.
One approach I’ve used is something like this:
https://github.com/webrecorder/warcio.js/blob/main/src/readers.js#L282 (though this is with an earlier version of pako).
The latest version of pako may try to read multi-member gzips as one buffer, though in my tests this doesn’t always work.
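The avail_in bookkeeping comes down to simple arithmetic. A minimal sketch, assuming a zlib-style stream state that exposes avail_in once a member has ended (the ZStreamLike type and the nextMemberOffset function are illustrative names, not pako exports):

```typescript
// Assumed shape: a zlib-style stream state with the avail_in counter
// mentioned above (bytes of the current chunk not yet consumed).
interface ZStreamLike {
  avail_in: number;
}

// Hypothetical helper: absolute offset at which the next member begins.
// `chunkStart` is the absolute offset of the chunk's first byte and
// `chunkLength` is the number of bytes that were handed to the inflater.
function nextMemberOffset(
  chunkStart: number,
  chunkLength: number,
  strm: ZStreamLike
): number {
  if (strm.avail_in < 0 || strm.avail_in > chunkLength) {
    throw new RangeError("avail_in must be between 0 and chunkLength");
  }
  // Bytes consumed from this chunk = bytes fed - bytes left over.
  return chunkStart + (chunkLength - strm.avail_in);
}
```

For example, if a 1000-byte chunk starting at offset 0 is fed in and avail_in is 240 when the member ends, the next member begins at offset 760, and the last 240 bytes of the chunk can be passed to a fresh inflater.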
A key requirement of my use case is to be able to get the offset of the beginning of each member, and to flush the data buffer at the end of each member.
Ideally, there could be a callback that indicates when a new member has been started and the offset at that new member:
onnewgzipmember: OnNewGzipMemberCallback
type OnNewGzipMemberCallback = (offset: number) => void
The ondata callbacks that follow an onnewgzipmember call are assumed to carry data from that gzip member, and ondata always flushes when the member boundary is reached.
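To make the proposed semantics concrete, here is a rough sketch of a wrapper that reports each member's offset and flushes at each boundary. It does not use fflate: decodeMember is a hypothetical stand-in for whatever single-member decoder is available, and it is assumed to report how many input bytes it consumed. Only the onnewgzipmember name and signature come from the proposal above; everything else (readMembers, DecodeMember, the simplified one-argument ondata) is invented for illustration.

```typescript
type OnNewGzipMemberCallback = (offset: number) => void;
// Simplified for this sketch: one call per member, no "final" flag.
type OnData = (data: Uint8Array) => void;

// Hypothetical stand-in for a single-member decoder: decodes exactly one
// member from the start of `input` and reports how many bytes it consumed.
type DecodeMember = (input: Uint8Array) => {
  output: Uint8Array;
  consumed: number;
};

function readMembers(
  data: Uint8Array,
  decodeMember: DecodeMember,
  onnewgzipmember: OnNewGzipMemberCallback,
  ondata: OnData
): void {
  let offset = 0;
  while (offset < data.length) {
    // Announce the member before any of its data is delivered.
    onnewgzipmember(offset);
    const { output, consumed } = decodeMember(data.subarray(offset));
    if (consumed <= 0) throw new Error("decoder made no progress");
    // Flush everything from this member at its boundary.
    ondata(output);
    offset += consumed;
  }
}
```

Because each member is decoded by a fresh call, output from two different members is never mixed in a single ondata call, and the offsets passed to onnewgzipmember always point at the first byte of a member.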
Issue Analytics
- Created a year ago
- Reactions: 2
- Comments: 7 (3 by maintainers)
I had forgotten about this issue but what you’ve proposed does seem compelling. I’ll look into the bgzip spec and try to implement it for the next release.
I’ll preface by saying that this is a very niche use case, so even if it is easy to implement I’ll have to weigh the bundle size costs to see if it’s worth adding. That being said, this might be possible to add to the streaming API, i.e. fflate.Gunzip. I’ll let you know if it seems feasible when I can.