java.util.zip.DataFormatException: invalid block type in JdkZlibDecoder for large gzip stream

When downloading a large gzip stream, Inflater throws a DataFormatException when using HttpContentDecompressor, but not when using a GZIPInputStream.

Expected behavior

No exception.

Actual behavior

Exception in thread "main" io.netty.handler.codec.compression.DecompressionException: decompression failure
	at io.netty.handler.codec.compression.JdkZlibDecoder.decode(JdkZlibDecoder.java:273)
	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:507)
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:446)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
	at io.netty.channel.embedded.EmbeddedChannel.writeInbound(EmbeddedChannel.java:343)
	at io.netty.handler.codec.http.HttpContentDecoder.decode(HttpContentDecoder.java:264)
	at io.netty.handler.codec.http.HttpContentDecoder.decodeContent(HttpContentDecoder.java:171)
	at io.netty.handler.codec.http.HttpContentDecoder.decode(HttpContentDecoder.java:160)
	at io.netty.handler.codec.http.HttpContentDecoder.decode(HttpContentDecoder.java:47)
	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at com.nextlisten.commoncreeper.Temp3$1.channelRead(Temp3.java:27)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436)
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296)
	at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1368)
	at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1234)
	at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1280)
	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:507)
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:446)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:831)
	Suppressed: java.lang.Exception: #block terminated with an error
		at reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:99)
		at reactor.core.publisher.Mono.block(Mono.java:1703)
		at com.nextlisten.commoncreeper.Temp3.main(Temp3.java:38)
Caused by: java.util.zip.DataFormatException: invalid block type
	at java.base/java.util.zip.Inflater.inflateBytesBytes(Native Method)
	at java.base/java.util.zip.Inflater.inflate(Inflater.java:378)
	at io.netty.handler.codec.compression.JdkZlibDecoder.decode(JdkZlibDecoder.java:240)
	... 56 more

Full logs: logs.txt

Steps to reproduce

To verify that the stream itself isn’t corrupt, we can do the following:

  • Get the body as an InputStream and wrap it in a GZIPInputStream.
  • Reading the entire stream from the GZIPInputStream succeeds.

Minimal yet complete reproducer code (or URL to code)

This code fails:

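import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;
import io.netty.handler.codec.http.HttpContentDecompressor;
import io.netty.handler.codec.http.HttpHeaderNames;
import io.netty.handler.codec.http.HttpHeaderValues;
import io.netty.handler.codec.http.HttpResponse;
import reactor.netty.http.client.HttpClient;

import java.nio.charset.StandardCharsets;

public class Temp3 {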
    public static final void main(final String... args) {
        final HttpClient httpClient = HttpClient.create()
                .doOnConnected(c -> {
                    c.addHandlerFirst(new HttpContentDecompressor());
                    c.addHandlerFirst(new ChannelInboundHandlerAdapter() {
                        @Override
                        public void channelRead(final ChannelHandlerContext ctx, final Object msg) {
                            if (msg instanceof HttpResponse response && response.status().code() >= 200) {
                                response.headers()
                                        .set(HttpHeaderNames.CONTENT_ENCODING, HttpHeaderValues.GZIP)
                                        .set(HttpHeaderNames.CONTENT_TYPE, HttpHeaderValues.TEXT_PLAIN);
                            }

                            ctx.fireChannelRead(msg);
                        }
                    });
                })
                .compress(true);

        httpClient
                .get()
                .uri("https://commoncrawl.s3.amazonaws.com/cc-index/collections/CC-MAIN-2021-25/indexes/cdx-00000.gz")
                .responseSingle((httpClientResponse, byteBufMono) -> byteBufMono.asString(StandardCharsets.UTF_8))
                .doOnNext(s -> System.out.println(s.substring(0, 1000)))
                .block();
    }
}

This code succeeds:

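import reactor.netty.http.client.HttpClient;

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPInputStream;

public class Temp4 {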
    public static final void main(final String... args) throws IOException {
        final HttpClient httpClient = HttpClient.create();
        final InputStream is = httpClient.get()
                .uri("https://commoncrawl.s3.amazonaws.com/cc-index/collections/CC-MAIN-2021-25/indexes/cdx-00000.gz")
                .responseSingle((httpClientResponse, byteBufMono) -> byteBufMono.asInputStream())
                .block();

        try (final GZIPInputStream gis = new GZIPInputStream(is);
                final CountingOutputStream cos = new CountingOutputStream()) {
            gis.transferTo(cos);
            System.out.println(cos.size);
        } finally {
            is.close();
        }
    }

    private static class CountingOutputStream extends OutputStream {
        private long size = 0;

        @Override
        public void write(final int b) throws IOException {
            size++;
        }
    }
}

Netty version

  • netty-codec-http: 4.1.65.Final
  • netty-transport: 4.1.65.Final
  • reactor-netty-http: 1.0.8

JVM version (e.g. java -version)

openjdk version "16" 2021-03-16
OpenJDK Runtime Environment (build 16+36-2231)
OpenJDK 64-Bit Server VM (build 16+36-2231, mixed mode, sharing)

OS version (e.g. uname -a)

Windows 10 Pro 21H1 (build 19043.1083)

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 11 (10 by maintainers)

Top GitHub Comments

3 reactions
normanmaurer commented, Jul 27, 2021

This is fixed by https://github.com/netty/netty/pull/11521 … I am currently looking into writing a test-case for it.

1 reaction
idelpivnitskiy commented, Jul 22, 2021

After a quick look, my guess is that we mess up the indexes when we set more input for the Inflater. There is a risk that we push the same bytes twice if the indexes are moved incorrectly between multiple decode invocations.
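
The index bookkeeping described in this guess can be illustrated outside Netty with plain java.util.zip. This is only a minimal sketch, not Netty's decoder code: the class ChunkedInflateSketch and its methods are hypothetical names made up for illustration. It feeds an Inflater in small chunks, asks for more input only when needsInput() reports true, and advances the offset by exactly the number of bytes handed over, so no byte is pushed twice.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical illustration class; not part of Netty or the original issue.
final class ChunkedInflateSketch {

    /** Compresses the given bytes with the JDK Deflater (zlib format). */
    static byte[] deflate(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Inflates the input chunk by chunk, handing each byte over exactly once. */
    static byte[] inflateInChunks(byte[] compressed, int chunkSize) {
        Inflater inflater = new Inflater();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        int offset = 0;
        try {
            while (!inflater.finished()) {
                // Only supply new input once the previous chunk is fully consumed.
                if (inflater.needsInput()) {
                    if (offset >= compressed.length) {
                        throw new IllegalStateException("truncated input");
                    }
                    int len = Math.min(chunkSize, compressed.length - offset);
                    inflater.setInput(compressed, offset, len);
                    offset += len; // advance by exactly what was handed over
                }
                int n = inflater.inflate(buf);
                if (n == 0 && inflater.needsDictionary()) {
                    throw new IllegalStateException("preset dictionary required");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            // This is the exception type seen in the stack trace of the report.
            throw new IllegalStateException("corrupt or repeated input", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] original = "invalid block type ".repeat(500).getBytes(StandardCharsets.UTF_8);
        byte[] restored = inflateInChunks(deflate(original), 16);
        System.out.println(Arrays.equals(original, restored));
    }
}
```

If the offset were advanced by less than the chunk length, the Inflater would see some bytes a second time, which is the kind of stream corruption the comment suspects. Whether that is what actually happens in JdkZlibDecoder is not confirmed by this sketch.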

I did check that JZlibDecoder works fine with this input. So, the workaround is to set the -Dio.netty.noJdkZlibDecoder=true system property. JdkZlibDecoder requires deeper debugging.
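
When passing a -D flag on the command line is not practical, the same property could in principle be set programmatically. The sketch below rests on two assumptions not stated in the thread: that Netty reads the property once when its compression classes are first loaded, so the call has to run before any of them are touched, and that the JZlib-based decoder needs the separate jzlib library on the classpath. The class name NoJdkZlibDecoderWorkaround is hypothetical.

```java
// Hypothetical helper; only the property name comes from the comment above.
final class NoJdkZlibDecoderWorkaround {

    static final String PROPERTY = "io.netty.noJdkZlibDecoder";

    /** Sets the flag. Assumed to be effective only before Netty's compression classes load. */
    static void apply() {
        System.setProperty(PROPERTY, "true");
    }

    public static void main(String[] args) {
        apply();
        System.out.println(PROPERTY + "=" + System.getProperty(PROPERTY));
        // Build the HttpClient (as in the Temp3 reproducer) only after this point.
    }
}
```

The command-line form quoted in the comment is the safer choice, because it does not depend on class-loading order.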
