maximise lossless compression of vcf_to_zarr
Moving and storing large VCF files is a problem. One option is to transform them into a more compact data format that contains enough data to fully restore the VCF at a later date (see https://github.com/pystatgen/sgkit/issues/924). The default vcf_to_zarr reduces the size of the data significantly. However, when extra fields are included, e.g. ["INFO/*", "FORMAT/*"], the resulting Zarr structure can be larger than the original VCF.
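A rough way to explore this size trade-off is to compare lossless codecs on the raw bytes of a single field. The sketch below is only an illustration under stated assumptions: it uses Python's standard-library codecs rather than sgkit or Zarr, the helper name compressed_sizes is hypothetical, and the input is synthetic data standing in for one mostly-zero FORMAT field, not a real VCF.

```python
import bz2
import lzma
import random
import zlib


def compressed_sizes(raw: bytes) -> dict[str, int]:
    """Return the compressed size in bytes for a few stdlib lossless codecs.

    Hypothetical helper for illustration; not part of sgkit or Zarr.
    """
    codecs = {
        "zlib": (zlib.compress, zlib.decompress),
        "bz2": (bz2.compress, bz2.decompress),
        "lzma": (lzma.compress, lzma.decompress),
    }
    sizes = {}
    for name, (compress, decompress) in codecs.items():
        packed = compress(raw)
        # Lossless means the round trip must reproduce the input exactly.
        assert decompress(packed) == raw
        sizes[name] = len(packed)
    return sizes


# Assumption: synthetic stand-in for one field, mostly zeros with a few 1s and 2s.
rng = random.Random(0)
raw = bytes(rng.choices([0, 1, 2], weights=[90, 8, 2], k=100_000))

sizes = compressed_sizes(raw)
for name, size in sizes.items():
    print(f"{name}: {size} bytes ({size / len(raw):.1%} of raw)")
```

The ranking of codecs depends heavily on the data, so the same comparison would need to be repeated on the real per-field arrays before drawing any conclusion about a particular VCF.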
Top GitHub Comments
With #943 I was able to get the Zarr size down to about 16% larger than genozip on the test file (using bzip2).
@ravwojdyla pointed out that #80 is related.