Blocklists are sent uncompressed by Cloudflare reverse proxy


Following up on an issue I originally pointed out here: https://github.com/celzero/rethink-app/issues/564#issuecomment-1250396445

The lack of compression means users on metered connections have to download more than twice as much data as would otherwise be required for blocklists.

Test:

curl -sI --compressed https://download.rethinkdns.com/trie | grep content-encoding

Copying the blocklist file to a domain I control with gzip enabled in the Nginx config shows that the command works correctly.
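
For anyone who prefers to script the check, the curl test above can be approximated in Python. This is only a sketch: `encoding_from_headers` and `fetch_encoding` are names made up for illustration, and advertising gzip alone is an assumption about what the client can decode.

```python
import urllib.request
from email.message import Message


def encoding_from_headers(headers: Message) -> str:
    """Return the Content-Encoding value, or 'identity' when the header is absent."""
    return (headers.get("Content-Encoding") or "identity").strip().lower()


def fetch_encoding(url: str, timeout: float = 10.0) -> str:
    """Send a HEAD request that advertises gzip support and report the encoding used."""
    req = urllib.request.Request(
        url,
        method="HEAD",
        headers={"Accept-Encoding": "gzip"},  # assumption: the client only handles gzip
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        # resp.headers is an email.message.Message subclass, so lookups ignore case.
        return encoding_from_headers(resp.headers)
```

Calling `fetch_encoding("https://download.rethinkdns.com/trie")` would return `identity` whenever the proxy sends the body uncompressed, which is the same condition as the grep above printing nothing.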

The issue seems to be Cloudflare’s process for deciding which files to compress, which relies on mimetype. Since the blocklist file (correctly) doesn’t have any mimetype indicating compressibility, Cloudflare will decompress the file sent from the origin server and send it to the client uncompressed.

Some possible workarounds:

  • If all RethinkDNS versions and platforms support HTTP compression (likely), then you can set cache-control: no-transform on your origin server and Cloudflare should cache the gzipped file unaltered. You might also look at something like gzip_static for Nginx which would allow you to get the best compression ratios with e.g. Zopfli.

  • You could rely on decompression in the client application rather than at the protocol level. You might consider using something like zstd, which would give better compression ratios than gzip and might actually be faster than doing it at the protocol level due to extremely fast decompression for zstd.
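
As a rough illustration of the first workaround, the file can be compressed once ahead of time and served as-is, which is the idea behind gzip_static. The sketch below uses Python's standard gzip module as a stand-in for Zopfli or zstd; the sample data and the `precompress` helper are hypothetical, and the ratio it achieves says nothing about the real blocklist.

```python
import gzip


def precompress(data: bytes) -> bytes:
    """Compress once at the highest gzip level, as a build or deploy step would."""
    return gzip.compress(data, compresslevel=9)


# Hypothetical stand-in for a blocklist: repetitive, domain-like text.
sample = b"".join(b"ads%d.example.com\n" % i for i in range(5000))
packed = precompress(sample)

# A client that sent Accept-Encoding: gzip recovers the original bytes exactly.
restored = gzip.decompress(packed)

print(f"raw: {len(sample)} bytes, gzip -9: {len(packed)} bytes")
```

Because the compression cost is paid once at publish time rather than per request, a slower but denser compressor becomes practical, which is why the bullet above mentions Zopfli.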

Issue Analytics

  • State: closed
  • Created a year ago
  • Reactions: 1
  • Comments: 7

Top GitHub Comments

2 reactions
ignoramous commented, Sep 29, 2022

Part of #573

Blocklists will be downloaded from the ?compressed endpoint from here on. Thanks @afontenot for pushing us to reevaluate our choices and helping Rethink be better! Appreciate it.

1 reaction
ignoramous commented, Sep 22, 2022

> My internet is only 100 Mbps

100Mbps ought to be enough for anybody 😉

> When the file is compressed on the fly, the content-length header isn’t set (because the server doesn’t know what the final size of the file will be). When you download without the ?compressed, it is sent and therefore you get a working progress bar (“5 seconds remaining”) in Firefox.

Yep. I can’t remember why, but I believe that the absence of content-length caused issues with Android’s Download Manager or some such, which is why we moved to application/octet-stream.
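
The missing content-length follows from how on-the-fly compression works: output is emitted before the input has been fully read, so the total size is only known at the end. A minimal sketch with Python's zlib, where `gzip_stream` and the payload are made up for illustration and do not reflect how the actual server is implemented:

```python
import zlib
from typing import Iterable, Iterator


def gzip_stream(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield gzip-framed output piece by piece, as an on-the-fly compressor would."""
    comp = zlib.compressobj(6, zlib.DEFLATED, 31)  # wbits=31 selects the gzip container
    for chunk in chunks:
        out = comp.compress(chunk)
        if out:
            yield out
    yield comp.flush()  # trailer and any buffered data


# Hypothetical payload, split into 4 KiB pieces as if read from disk.
payload = bytes(range(256)) * 400
pieces = [payload[i:i + 4096] for i in range(0, len(payload), 4096)]

sent = 0
for block in gzip_stream(pieces):
    # Each block could already be on the wire here; the final total is still unknown.
    sent += len(block)

print(f"compressed size, known only after the last block: {sent} bytes")
```

A server in this position typically falls back to chunked transfer encoding, and a client that wants a progress bar has nothing to measure against.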

Thanks. I see the compressed output with wget (around 30M), which, unlike Firefox, doesn’t lie.

wget --compression=auto "https://download.rethinkdns.com/trie?compressed=true" -a /tmp/wget.log && less /tmp/wget.log

--2022-09-22 01:42:07--  https://download.rethinkdns.com/trie?compressed
HTTP request sent, awaiting response... 200 OK
Length: 63409270 (60M) [application/wasm]
Saving to: 'trie?compressed'
...
...
...
31750K .......... .......... .......... ....                  10.8M=8.0s

2022-09-22 21:45:01 (3.87 MB/s) - 'trie?compressed' saved [63409270]

> Zstd supports streaming decompression so you can just decompress the bytes as they’re downloaded over the wire.

The problem is the native zstd lib (we’d have to bundle one for each arch), the JNI overhead, and the additional code that we’d have to write and maintain… 😄 The AOSP repositories do have zstd built from source, but it is unclear if one can dynamically link to it.
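
For reference, streaming decompression itself looks roughly like the sketch below. It uses gzip through Python's standard zlib module as a stand-in, since zstd needs a third-party or native library (the very cost discussed above); `gunzip_stream`, the sample data, and the 1 KiB chunk size are all assumptions for illustration.

```python
import gzip
import zlib
from typing import Iterable, Iterator


def gunzip_stream(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Decompress gzip data incrementally, without buffering the whole download."""
    decomp = zlib.decompressobj(31)  # wbits=31 expects a gzip header and trailer
    for chunk in chunks:
        out = decomp.decompress(chunk)
        if out:
            yield out
    tail = decomp.flush()
    if tail:
        yield tail


# Simulate a download: hypothetical data, compressed, then delivered in 1 KiB chunks.
original = b"blocked.example.net\n" * 10000
wire = gzip.compress(original)
network_chunks = (wire[i:i + 1024] for i in range(0, len(wire), 1024))

received = b"".join(gunzip_stream(network_chunks))
```

In a real client each decompressed piece would be written to disk as it arrives instead of being joined in memory.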

I’ll switch to downloading from the ?compressed endpoint on the app for v053k (the next release due in a few days) to see how it goes.

