Allow more aggressive transform caching in multiproject monorepos
🐛 Bug Report
In multiproject configs, each project root gets its own jest transform cache. This leads to duplicate work transforming the same files, even if they use the same configs. If a file is used in n projects, it will be transformed n times.
Locally, this increases cache disk usage. In CI, where the cache will not be warm, it increases runtime as duplicate work is required.
Writing a custom transformer with a simpler cache key implementation does not solve this - each project gets a separate cache folder.
The relevant code is in https://github.com/facebook/jest/blob/c98b22097cb6faa3ed3fabf197cbe4f466620b9f/packages/jest-transform/src/ScriptTransformer.ts#L132-L136, which forces a unique cache path per config.name. If unassigned, config.name is set to a hash based on the path and index.
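To make the effect concrete, here is a minimal sketch of why an identical cache key still misses across projects. It is not Jest's implementation: cacheFileFor, the directory layout, the project names and the in-memory "disk" are all assumptions made for illustration.

```typescript
// Hypothetical stand-in for the per-project cache location. The real logic
// lives in @jest/transform; this only mirrors the idea that the project name
// is part of the cache directory.
function cacheFileFor(
  cacheDirectory: string,
  projectName: string,
  cacheKey: string,
): string {
  return `${cacheDirectory}/jest-transform-cache-${projectName}/${cacheKey}`;
}

// In-memory stand-in for files written to the cache directory.
const written: Record<string, boolean> = {};

// The transformer returns the same key for the shared file in every project.
const key = "same-key-from-transformer";

// Project A transforms the file and writes the result to its own folder.
written[cacheFileFor("/tmp/jest", "project-a", key)] = true;

// Project B looks in a different folder, so the lookup misses.
const hitInA = written[cacheFileFor("/tmp/jest", "project-a", key)] === true;
const hitInB = written[cacheFileFor("/tmp/jest", "project-b", key)] === true;

console.log(hitInA); // true
console.log(hitInB); // false
```

Giving both projects the same name would make the two paths identical, which is what the shared-name workaround relies on.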
I've tried adding a common name to all projects' jest configs. That fixes the transform problem, but breaks other things (manual mocks in a __mocks__ folder don't work consistently). On our large monorepo, this gave a ~30% improvement in total runtime, but at the cost of __mocks__ becoming unpredictable.
I appreciate that there are edge cases to handle here (potentially different jest configs could warrant a different cache), but I think it should be available for the jest transformer to decide whether this is important (e.g. if relevant, a transformer could include config.name in the cache key manually).
e.g. optionally allow transformers to provide their own implementation of getCacheFilePath, which would override the use of HasteMap.getCacheFilePath(this._config.cacheDirectory, 'jest-transform-cache-' + this._config.name, VERSION).
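A rough sketch of what that proposal could look like follows. Everything in it is hypothetical: Jest's transformer API has no getCacheFilePath hook, and ProjectConfigLike, TransformerLike, defaultCacheDir and resolveCacheDir are names invented here to show the fallback logic.

```typescript
// Hypothetical types for illustration only; they are not Jest's real types.
interface ProjectConfigLike {
  cacheDirectory: string;
  name: string;
}

interface TransformerLike {
  // Hypothetical optional hook: NOT part of Jest's transformer API.
  getCacheFilePath?(config: ProjectConfigLike): string;
}

// Stand-in for the current behaviour: one directory per config.name.
function defaultCacheDir(config: ProjectConfigLike): string {
  return `${config.cacheDirectory}/jest-transform-cache-${config.name}`;
}

// Prefer the transformer's own answer when it provides one.
function resolveCacheDir(
  transformer: TransformerLike,
  config: ProjectConfigLike,
): string {
  return transformer.getCacheFilePath
    ? transformer.getCacheFilePath(config)
    : defaultCacheDir(config);
}

const projectA: ProjectConfigLike = { cacheDirectory: "/tmp/jest", name: "project-a" };
const projectB: ProjectConfigLike = { cacheDirectory: "/tmp/jest", name: "project-b" };

// A transformer that opts in to one cache shared by all projects.
const sharingTransformer: TransformerLike = {
  getCacheFilePath: (config) => `${config.cacheDirectory}/jest-transform-cache-shared`,
};
// A transformer that does nothing special keeps per-project directories.
const plainTransformer: TransformerLike = {};

const sharedDirsMatch =
  resolveCacheDir(sharingTransformer, projectA) ===
  resolveCacheDir(sharingTransformer, projectB);
const defaultDirsMatch =
  resolveCacheDir(plainTransformer, projectA) ===
  resolveCacheDir(plainTransformer, projectB);

console.log(sharedDirsMatch); // true
console.log(defaultDirsMatch); // false
```

Under this design, a transformer whose output really does depend on the project could keep the default directory, or keep the shared one and fold config.name into its cache key.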
If this sort of change would be accepted, I can probably provide a PR.
To Reproduce
Steps to reproduce the behavior:
- Set up a multiproject config
- Run tests with a cleared cache
- Either observe the cache on disk, or tap into the transform to count the number of times a file is transformed
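One way to "tap into the transform" for the last step is a pass-through transformer that counts calls per file. The sketch below is an assumption-laden illustration: the process signature is simplified to two arguments, countingTransformer and transformCounts are invented names, and the counter only aggregates across projects when tests run in a single process (in band).

```typescript
// Per-file call counter. Shared across projects only when everything runs
// in the same process, e.g. in band.
const transformCounts: Record<string, number> = {};

// Hypothetical pass-through transformer with a simplified `process` signature.
const countingTransformer = {
  process(sourceText: string, sourcePath: string): string {
    transformCounts[sourcePath] = (transformCounts[sourcePath] || 0) + 1;
    // A real transformer would compile here; this one returns the source as-is.
    return sourceText;
  },
};

// Simulate two projects each transforming the same shared file once.
countingTransformer.process("module.exports = 1;", "packages/c/index.js");
countingTransformer.process("module.exports = 1;", "packages/c/index.js");

console.log(transformCounts["packages/c/index.js"]); // 2
```

A count above 1 for a file in a shared package, starting from a cleared cache, indicates the duplicate work described in this report.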
Expected behavior
If the transform config is the same, each file is only transformed once
Link to repl or repo (highly encouraged)
https://github.com/lexanth/jest-projects-repro
This is a monorepo with 3 packages (A, B and C). A and B consume C. The code in C currently gets transformed once per consuming package, even when the transformer (in the jest-preset package) uses a super aggressive cache key implementation (yarn test:ci, which could be used e.g. in CI if we know the other relevant configs are constant).
Adding name: process.env.USE_SIMPLIFIED_CACHE ? '_' : undefined to each package's jest config makes them all use the same cache, but it is a bit of a hack and breaks other things in my actual repo.
Everything is running in band because the tests are so fast that multiple workers all start transforming before another can populate the cache anyway.
envinfo
System:
OS: macOS 10.15.7
CPU: (16) x64 Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Binaries:
Node: 12.18.1 - ~/.nvm/versions/node/v12.18.1/bin/node
Yarn: 1.22.10 - /usr/local/bin/yarn
npm: 6.14.5 - ~/.nvm/versions/node/v12.18.1/bin/npm
npmPackages:
jest: ^26.6.3 => 26.6.3
Issue Analytics
- Created 3 years ago
- Comments:15 (2 by maintainers)
Dunno. PR welcome?
I don't understand the "let's be smarter about busting in transformers" comments: even if the cache key is the same, it's a cache miss, since Jest will look in different directories for different projects when checking whether the cached file exists. If by "instead of relying on one-size-fits-all solutions like stringified config or project name" you mean "remove project name from the algorithm", that has nothing to do with the transformers themselves. That code lives in @jest/transform. I'm down with just removing that part of it, which should solve it, as we'll be trusting the cache key from transformers.
Making getCacheKey of babel-jest "smarter" is orthogonal to this issue (although I agree it should be done), as this issue is about unlocking the ability of transformers to be "smarter" at all. Any update we make to getCacheKey would be void, since @jest/transform wouldn't get cache hits regardless.