Core D2 dependency filename changes between builds break Netlify indexing
🐛 Bug Report
Docs/files that are otherwise unchanged between builds are marked as changed when the runtime~main.<hash>.js filename changes. This occurs because all generated HTML files import runtime~main.<hash>.js. Since static site hosts like Netlify rely on file hashes for indexing, files are incorrectly marked as changed between builds, which can greatly increase the overall build/deploy time. Our team noticed this behavior after our D2 site grew beyond 1K docs.
Have you read the Contributing Guidelines on issues?
Yes.
To Reproduce
1. Build your D2 site, i.e. yarn run build.
2. Dump file hashes of generated files/HTML in the build dir, e.g.
   cd build && find . -type f -exec md5 "{}" \; | sort
3. Change/edit/add any file, e.g. doc, page, blog, etc.
4. Repeat steps 1 and 2.
5. Compare output of steps 2 and 4 using a diff tool and notice all the additional files that have changed, e.g.
   code --diff before_change_hashes.txt after_change_hashes.txt
   (example using vscode)
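The hash-dump-and-diff steps above can also be scripted. The following is a minimal Node.js sketch, assuming the site is built into a local directory; `hashTree` and `changedFiles` are illustrative names and not part of Docusaurus or Netlify.

```javascript
const { createHash } = require('crypto');
const fs = require('fs');
const path = require('path');

// Recursively collect { relativePath: md5 } for every file under `dir`.
function hashTree(dir, root = dir, out = {}) {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      hashTree(full, root, out);
    } else if (entry.isFile()) {
      const digest = createHash('md5').update(fs.readFileSync(full)).digest('hex');
      out[path.relative(root, full)] = digest;
    }
  }
  return out;
}

// Return the sorted paths that were added, removed, or whose hash differs.
function changedFiles(before, after) {
  const all = new Set([...Object.keys(before), ...Object.keys(after)]);
  return [...all].filter((p) => before[p] !== after[p]).sort();
}

// Possible usage (assumes the build output lives in ./build):
//   const before = hashTree('build');
//   ...edit one doc and rebuild...
//   console.log(changedFiles(before, hashTree('build')));
```

If the behavior described in this report is present, the printed list would contain many HTML files beyond the one that was edited.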
Expected behavior
Only docs/files that were intentionally changed between builds should be modified.
Actual Behavior
All static HTML files that import any or all of the following dependencies are modified when dependency filenames change following a build. This appears to be caused when the <hash> portion of each filename is changed between builds.
- runtime~main.<hash>.js # changes whenever any doc/file is modified between builds
- main.<hash>.js # dependent files could be modified if this filename changes
- styles.<hash>.js # dependent files could be modified if this filename changes
- styles.<hash>.css # dependent files could be modified if this filename changes
- etc.
Depending on the size of the D2 site, this could potentially introduce many more modified files than expected between builds, which could render indexing by hosting/build sites like Netlify, GitHub Actions, et al., ineffective.
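To illustrate why an otherwise unchanged page is flagged, here is a small sketch that hashes two renderings of the same page differing only in the referenced runtime filename. The page template and the hash values in the filenames are made-up placeholders, not real Docusaurus output.

```javascript
const { createHash } = require('crypto');

const md5 = (text) => createHash('md5').update(text).digest('hex');

// Hypothetical page template: the doc content is identical in both builds,
// only the name of the imported runtime file varies.
const renderPage = (runtimeFile) =>
  `<html><body><p>Unchanged doc</p><script src="/${runtimeFile}"></script></body></html>`;

const build1 = renderPage('runtime~main.aaaa1111.js'); // placeholder hash
const build2 = renderPage('runtime~main.bbbb2222.js'); // placeholder hash

// The file-level hashes differ even though the doc content did not change.
console.log(md5(build1) === md5(build2)); // false
```

A host that indexes files by content hash would therefore treat every page importing the renamed file as modified.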
In the following screenshot, note all the changed files despite only ./docs/contributing/index.html actually being modified:
Your Environment
- Docusaurus version used: observed on alpha 48, 58 and 61
- Environment name and version (e.g. Chrome 78.0.3904.108, Node.js 10.17.0): Node 14.4.0 and Netlify (Node v12).
- Operating system and version (desktop or mobile): Desktop, Mac OSX
Reproducible Demo
Can be reproduced on any D2 site.
Thanks @slorber, we’ll begin testing ASAP. I appreciate the tips.
@glicht, FYI
@sserrata if you don’t apply cache-control headers (like “immutable”) to hashed assets on your CDN, you can try removing the hash from the output JS filenames.
On Netlify, it will still provide etags-based caching, which is not too bad imho (and I guess most users don’t even set more aggressive caching headers on their CDN)
If you use configureWebpack(), you can pass a config that does this.
As far as I see, the chunks under /assets/... will remain hashed (so you can still cache them aggressively) but the runtime/main files won’t change anymore (which means it’s unsafe to cache them aggressively, but it looks fine to me).
If this setup works fine for you, I think we could make this a default for Docusaurus.
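The config snippet from this comment is not shown here, so the following is only a sketch of what such a configureWebpack() override might look like, not the maintainer's actual suggestion. It assumes a local plugin file (the plugin name and file path are hypothetical) and relies on webpack's standard output.filename option. Whether chunks under /assets/... stay hashed depends on Docusaurus's default chunkFilename, which this sketch leaves untouched.

```javascript
// Hypothetical local plugin, e.g. saved as plugins/unhashed-entry.js.
function unhashedEntryPlugin() {
  return {
    name: 'unhashed-entry-plugin', // illustrative name
    configureWebpack(config, isServer) {
      // Leave the server bundle alone; only the client output is affected.
      if (isServer) {
        return {};
      }
      return {
        output: {
          // Entry files (e.g. runtime~main, main) are emitted without a hash.
          // chunkFilename is deliberately not set here.
          filename: '[name].js',
        },
      };
    },
  };
}

module.exports = unhashedEntryPlugin;
```

Assuming that file path, the plugin could be registered in docusaurus.config.js with plugins: [require.resolve('./plugins/unhashed-entry')]. As noted in the comment, the unhashed files should then not be served with aggressive caching headers such as “immutable”.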
Docusaurus is not a typical webpack app: it has a single entrypoint for all the pages, so any page modification modifies this entrypoint.