Question: How to get XZ uncompressed size

Hello, as far as I know the XZ format has an index section that contains archive metadata (most notably the uncompressed size).

I’ve skimmed through the XZ implementation in this package, and it looks like SharpCompress can read the XZ index, but it’s impossible to get XZBlock information without reading and decompressing the whole archive.

How can I get the XZ index information using this library without extracting the archive contents?

It would be nice to have the uncompressed stream size populated in the Length property.

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

1 reaction
adamhathcock commented, Jun 4, 2021

Zip has the same issue with streamed files where you don’t know the size before compression.

We should be able to implement this size for XZ when using the Archive strategy, but not the Reader strategy.

0 reactions
x1unix commented, Jul 13, 2021

@adamhathcock here is a simple snippet to calculate the uncompressed size of XZ contents. Hope it helps.

It works only with seekable streams. For non-seekable streams, the whole file has to be read first.

using System.Diagnostics;
using System.IO;
using System.Linq;
using SharpCompress.Compressors.Xz;

public class XzFileInfo
    {
        // The XZ stream footer is 12 bytes long according to the spec.
        private const int XzFooterSize = 12;

        // Assumes a single-stream .xz file with no trailing stream padding.
        public static ulong GetUncompressedSize(string filePath)
        {
            using var file = File.OpenRead(filePath);

            // Read the footer from the end of the file.
            file.Seek(-XzFooterSize, SeekOrigin.End);
            var footer = XZFooter.FromStream(file);
            Debug.WriteLine($"BackwardSize: {footer.BackwardSize}");

            // BackwardSize gives the size of the index, which sits directly before the footer; seek to its start.
            file.Seek(-(XzFooterSize + footer.BackwardSize), SeekOrigin.End);
            var index = XZIndex.FromStream(file, false);
            Debug.WriteLine($"Index: number of records - {index.NumberOfRecords}");

            // Sum the uncompressed size of every block; the 0UL seed keeps an empty index from throwing.
            var size = index.Records.Aggregate(0UL, (acc, r) => acc + r.UncompressedSize);
            Debug.WriteLine($"Total size of uncompressed archive: {size} bytes");
            return size;
        }
    }
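
For readers who want to see what the footer and index actually contain, here is a minimal sketch of the same idea that parses the bytes directly and does not depend on SharpCompress types. The class name `XzIndexReader` and its helper methods are made up for this illustration and are not part of any library. The sketch assumes a single-stream .xz file with no stream padding after the footer, and it skips the CRC32 checks that a real reader should perform. Non-seekable input is buffered into memory first, which is the "read the whole file" case mentioned above.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;

// Hypothetical helper for illustration only; not a SharpCompress API.
public static class XzIndexReader
{
    private const int FooterSize = 12; // length of the XZ stream footer

    public static ulong GetUncompressedSize(Stream input)
    {
        // The index lives at the end of the file, so a forward-only stream
        // has to be buffered completely before it can be inspected.
        Stream stream = input;
        if (!input.CanSeek)
        {
            var buffer = new MemoryStream();
            input.CopyTo(buffer);
            stream = buffer;
        }

        // Footer layout: CRC32 (4) | Backward Size (4) | Stream Flags (2) | "YZ" (2).
        var footer = new byte[FooterSize];
        stream.Seek(-FooterSize, SeekOrigin.End);
        ReadFully(stream, footer);
        if (footer[10] != (byte)'Y' || footer[11] != (byte)'Z')
            throw new InvalidDataException("XZ footer magic not found.");

        // The stored value is little-endian; the real index size is (stored + 1) * 4.
        uint stored = BinaryPrimitives.ReadUInt32LittleEndian(footer.AsSpan(4, 4));
        long indexSize = ((long)stored + 1) * 4;

        // The index sits directly before the footer and starts with a 0x00 indicator byte.
        stream.Seek(-(FooterSize + indexSize), SeekOrigin.End);
        if (stream.ReadByte() != 0x00)
            throw new InvalidDataException("XZ index indicator not found.");

        ulong recordCount = ReadMultibyte(stream);
        ulong total = 0;
        for (ulong i = 0; i < recordCount; i++)
        {
            ReadMultibyte(stream);          // Unpadded Size of the block, not needed here
            total += ReadMultibyte(stream); // Uncompressed Size of the block
        }
        return total;
    }

    // XZ multibyte integer: 7 data bits per byte, least significant group first,
    // high bit set on every byte except the last, at most 9 bytes.
    private static ulong ReadMultibyte(Stream s)
    {
        ulong value = 0;
        for (int shift = 0; shift < 63; shift += 7)
        {
            int b = s.ReadByte();
            if (b < 0) throw new EndOfStreamException();
            value |= (ulong)(b & 0x7F) << shift;
            if ((b & 0x80) == 0) return value;
        }
        throw new InvalidDataException("Multibyte integer is longer than 9 bytes.");
    }

    // Stream.Read may return fewer bytes than requested, so loop until the buffer is full.
    private static void ReadFully(Stream s, byte[] buffer)
    {
        int offset = 0;
        while (offset < buffer.Length)
        {
            int n = s.Read(buffer, offset, buffer.Length - offset);
            if (n == 0) throw new EndOfStreamException();
            offset += n;
        }
    }
}
```

It could be called as `XzIndexReader.GetUncompressedSize(File.OpenRead(path))`. Files made of several concatenated XZ streams would need a loop that walks backwards from each footer to the previous stream, which this sketch leaves out.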
