Text files served from cache with etags are not compressed with gzip or brotli


v0.0.11 added support for cache revalidation via etags. Per RFC 2616, Cloudflare does not alter cached responses that carry a strong etag header. As a result, resources that are not already gzip- or brotli-compressed by the origin server (or kv-asset-handler) will either have their etags stripped or be delivered to the client uncompressed.

Some requirements for implementing compression on text files:

  • Must be able to serve files in both uncompressed and compressed form based on the client’s Accept-Encoding request header
  • Must provide gzip compression (brotli is a nice-to-have)
  • Must include content-type and content-encoding response headers upon cache insertion
  • Each variant of content-encoding will need to produce a corresponding etag variant
  • Must detect mime type of assets to determine which files can be compressed
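
A minimal sketch of how the requirements above could fit together, not the kv-asset-handler implementation. The names `isCompressible`, `pickEncoding` and `etagFor`, the MIME allow-list, and the `-gzip`/`-br` etag suffix scheme are all assumptions made for illustration; q-values other than `q=0` are ignored for brevity.

```typescript
type Encoding = "br" | "gzip" | "identity";

// Assumption: a small allow-list of MIME types worth compressing.
const COMPRESSIBLE: RegExp[] = [
  /^text\//,
  /^application\/(json|javascript|xml)/,
  /^image\/svg\+xml/,
];

function isCompressible(mimeType: string): boolean {
  return COMPRESSIBLE.some((re) => re.test(mimeType));
}

// Choose a content-encoding from the client's Accept-Encoding header.
// Prefers brotli, then gzip, and falls back to the uncompressed form.
function pickEncoding(acceptEncoding: string | null, mimeType: string): Encoding {
  if (!acceptEncoding || !isCompressible(mimeType)) return "identity";
  const accepted = new Set<string>();
  for (const part of acceptEncoding.split(",")) {
    const [name, ...params] = part.trim().toLowerCase().split(";");
    // An encoding listed with q=0 is explicitly refused by the client.
    const refused = params.some((p) => /^\s*q=0(\.0*)?\s*$/.test(p));
    if (name && !refused) accepted.add(name.trim());
  }
  if (accepted.has("br")) return "br";
  if (accepted.has("gzip")) return "gzip";
  return "identity";
}

// Hypothetical scheme: one etag variant per content-encoding.
function etagFor(baseEtag: string, encoding: Encoding): string {
  return encoding === "identity" ? `"${baseEtag}"` : `"${baseEtag}-${encoding}"`;
}
```

For example, `pickEncoding("gzip, deflate", "text/css")` returns `"gzip"`, while the same header with `"image/png"` returns `"identity"` because that type is not on the allow-list.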

Compression can be handled in one of three ways:

  1. Apply compression on-the-fly in Workers. This would probably require an NPM package and is the least resource-efficient option available
  2. Allow Cloudflare CDN to apply compression on-the-fly. This would require modifying both the format of the etag and the comparison validation function.
  3. Apply compression during upload to KV. This is probably the best solution and certainly the most resource efficient

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 19 (17 by maintainers)

Top GitHub Comments

5 reactions
shagamemnon commented, Jul 1, 2020

@SukkaW - can you please provide a detailed bug report if you believe the etag is not working as expected? Please bear in mind that:

  • The cache-control headers on the response are what dictate caching behavior. Setting cache-control: no-cache on the request does not alter the outcome of the response
  • There are significant differences between strong and weak etags. You can read how Chrome handles resources with strong etags here: https://stackoverflow.com/questions/43659756/chrome-ignores-the-etag-header-and-just-uses-the-in-memory-cache-disk-cache, which is the behavior exhibited in all documented tests to date against kv-asset-handler
  • Strong etags can still be served to the client from Cloudflare regardless of the page rule setting. We need to update the KB article you referenced, as the etag will be retained, even when double quotes are removed, on all plans

For a point of reference, use this site: https://free.dinos.network/. It’s on the free plan, has no cache settings set in page rules and – importantly – sends cache-control: must-revalidate, max-age=0 on the response. As outlined in the Stack Overflow post, this is the setting you need to set to get back consistent 304 responses from a remote server when using strong etags 😃

2 reactions
SukkaW commented, Jul 3, 2020

@shagamemnon

Did you look at my scenarios closely? Please do so; you’ll see that the outcome is reached because Cloudflare doesn’t support deflate compression.

In other words, we know exactly why this scenario doesn’t work: Cloudflare doesn’t support the deflate compression algorithm.

I do know why ETag is retained when Accept-Encoding: deflate is presented: because Cloudflare won’t apply ANY compression (deflate is not supported).

So far here is what I found:

When Cloudflare tries to apply compression using gzip or br:

  • The origin’s strong ETag is removed.
  • The origin’s strong ETag wrapped in quotes is transformed into a weak ETag (the prefix W/ is added).
  • The origin’s weak ETag is retained.

“Origin” can be an origin server or Workers. I tested both, and they behave the same when handling ETags.

Here the Cloudflare web server is doing the right thing: after compression with gzip/br the response is no longer byte-for-byte identical, which is why a strong ETag wrapped in quotes is transformed into a weak ETag (and the origin’s weak ETag can be retained).
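
The three observations above can be written down as a small model. This is only a sketch of the behavior reported in this comment, not documented Cloudflare behavior; the function name `etagAfterEdgeCompression` is hypothetical, and reading the first bullet as "a strong ETag without surrounding quotes" is an assumption.

```typescript
// Models what the client would see after Cloudflare applies gzip/br,
// according to the observations in this comment.
// Returns null when the ETag header is removed entirely.
function etagAfterEdgeCompression(originEtag: string): string | null {
  // Weak ETag from the origin: retained unchanged.
  if (originEtag.startsWith("W/")) return originEtag;
  // Strong ETag wrapped in double quotes: downgraded to a weak ETag.
  if (/^".*"$/.test(originEtag)) return `W/${originEtag}`;
  // Assumption: any other (unquoted) strong ETag is stripped.
  return null;
}
```

Under this model, `etagAfterEdgeCompression('"index.fa99522e2f.html"')` yields `W/"index.fa99522e2f.html"`.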

> If Cloudflare CDN receives the weakly validated version of the etag like W/"index.fa99522e2f.html", it sends to origin the strong etag "index.fa99522e2f.html" for validation.

The origin’s response with the strong ETag (before compression is applied and the ETag is transformed into a weak ETag) is what gets stored in Cloudflare Cache. That’s why Cloudflare sends the strong ETag to the origin server for validation, am I right?

> As such, when you send if-none-match: W/"index.fa99522e2f.html", the result is an empty response.

Don’t blame Cloudflare Cache, blame yourselves. Cloudflare Cache wants to return a 304 response, but kv-asset-handler just uses the response.body (which in this case is empty) to build a 200 response:

https://github.com/cloudflare/kv-asset-handler/blob/deda2296c920405bd377c3d17394ac611a49e548/src/index.ts#L164

https://github.com/cloudflare/kv-asset-handler/blob/deda2296c920405bd377c3d17394ac611a49e548/src/index.ts#L202

If kv-asset-handler used the response object returned by Cloudflare Cache directly, it would be an empty response with a 304 status code.

https://github.com/cloudflare/kv-asset-handler/blob/deda2296c920405bd377c3d17394ac611a49e548/src/index.ts#L177-L182

In short, the empty response is caused by the shouldRevalidate algorithm differing from Cloudflare Cache’s, not by the format of the ETag.
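
To illustrate the kind of mismatch described here: HTTP's weak comparison (used for If-None-Match, see RFC 7232) treats `W/"x"` and `"x"` as equal, so a revalidation check that compares the raw strings will disagree with a cache that applies weak comparison. The sketch below shows weak comparison only; `opaqueTag` and `weakMatch` are hypothetical names, not functions from kv-asset-handler, and splitting the header on commas is a simplification.

```typescript
// Strip the optional weak prefix so only the opaque tag is compared.
function opaqueTag(etag: string): string {
  return etag.trim().replace(/^W\//, "");
}

// Weak comparison of an If-None-Match header against the current ETag.
function weakMatch(ifNoneMatch: string | null, currentEtag: string): boolean {
  if (!ifNoneMatch) return false;
  if (ifNoneMatch.trim() === "*") return true;
  const current = opaqueTag(currentEtag);
  // Simplification: assumes no commas inside the ETag values themselves.
  return ifNoneMatch.split(",").some((candidate) => opaqueTag(candidate) === current);
}
```

With this comparison, `weakMatch('W/"index.fa99522e2f.html"', '"index.fa99522e2f.html"')` is `true`, whereas a plain string equality check on the same two values is `false`.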

> There are ways to mitigate this problem in your code, but it’s not easy and fraught with bugs outside of the scope of this discussion.

No, it is very easy. Cloudflare Cache’s caches.default.match returns a response object; just use it directly instead of building a new response, and you will have the correct status code (which is 304).
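
A rough sketch of the difference being argued, using a simplified stand-in type rather than the real Workers `Response`. `SimpleResponse`, `rebuildAlways` and `passThroughNotModified` are hypothetical names, and passing through only 304 responses is one possible reading of "use it directly".

```typescript
// Simplified stand-in for a cached response (not the Workers Response type).
interface SimpleResponse {
  status: number;
  body: string | null;
  headers: Record<string, string>;
}

// The pattern criticised above: always rebuild a 200 from the cached body,
// which turns an empty 304 into an empty 200.
function rebuildAlways(cached: SimpleResponse): SimpleResponse {
  return { status: 200, body: cached.body, headers: { ...cached.headers } };
}

// The suggested alternative: hand a 304 from the cache straight back.
function passThroughNotModified(cached: SimpleResponse): SimpleResponse {
  if (cached.status === 304) return cached;
  return { status: 200, body: cached.body, headers: { ...cached.headers } };
}
```

Given a cached `{ status: 304, body: null, headers: {} }`, `rebuildAlways` produces a 200 with a null body, while `passThroughNotModified` keeps the 304.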

> KV has practically unlimited storage space, there is no technical or practical limitations to creating versions with different encodings.

Blame Cloudflare staff (your guys) again.

  • Workers official website says:

image

  • KB says:

https://developers.cloudflare.com/workers/about/limits/#kv

image

  • Worker Dashboard says:

image
