maximise lossless compression of vcf_to_zarr

See original GitHub issue

Moving and storing large VCF files is a problem. One option is to transform them into a more compact data format that contains enough data to fully restore the VCF at a later date (see https://github.com/pystatgen/sgkit/issues/924). The default vcf_to_zarr reduces the size of the data significantly. However, when extra fields are included, e.g. ["INFO/*", "FORMAT/*"], the resulting Zarr store can be larger than the original VCF.
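To check whether a given conversion ends up larger than its input, the two on-disk sizes can be compared directly. The following is a minimal sketch, assuming the Zarr output is a directory store on the local filesystem; `tree_size` and `size_ratio` are hypothetical helper names for illustration and are not part of sgkit.

```python
import os


def tree_size(path):
    """Return the total size in bytes of a file, or of all files under a directory."""
    if os.path.isfile(path):
        return os.path.getsize(path)
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def size_ratio(zarr_path, vcf_path):
    """Return Zarr size divided by VCF size; a value above 1.0 means the Zarr store is larger."""
    return tree_size(zarr_path) / tree_size(vcf_path)
```

For example, `size_ratio("sample.zarr", "sample.vcf.gz")` (both paths are placeholders) would return a value above 1.0 in the situation described above, where including the extra fields makes the Zarr store larger than the original VCF.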

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 18

Top GitHub Comments

1 reaction
tomwhite commented, Oct 25, 2022

With #943 I was able to get the Zarr size down to about 16% larger than genozip on the test file (using bzip2).

0 reactions
tomwhite commented, Oct 13, 2022

@ravwojdyla pointed out that #80 is related.


