Feature: Share cache across pipelines
Required Information
Type: Feature
Enter Task Name: Cache (https://docs.microsoft.com/en-us/azure/devops/pipelines/tasks/utility/cache?view=azure-devops / https://github.com/microsoft/azure-pipelines-tasks/tree/master/Tasks/CacheV2)
Environment
- Azure Pipelines (yml)
- Can provide account / build / pipeline information privately
- Agent - Hosted vs2017-win2016
Issue Description
I have a number of pipelines (around 50) that run on the same hosted agent, for the same repository. All the pipelines use the same templates with lots of pipeline-specific variables. I use the following cache task in a build template:
- task: Cache@2
  condition: and(succeeded(), eq('${{ parameters.enableNugetCaching }}', true))
  displayName: NuGet Cache
  inputs:
    key: 'nuget | "${{ parameters.solution }}" | "$(Agent.OS)" | ${{ parameters.nugetConfig }},**/packages.config,!**/bin/**,!**/obj/**'
    restoreKeys: |
      nuget | "${{ parameters.solution }}" | "$(Agent.OS)"
      nuget | "${{ parameters.solution }}"
    path: '${{ parameters.nugetPackagesDirectorySource }}'
Caching works perfectly when I run a single pipeline twice - it restores the expected cache, and creates a new cache if one of the matched files changes, as you would expect.
However, when I run another pipeline on the same repo / set of code, the Cache task generates exactly the same key but does not “match” the existing cache; it creates a new one. As this cache is around 600 MB in size, the “pipeline cache” for the Azure DevOps organisation is now around 30 GB when it should be 600 MB. I would expect extra caches to be generated only when the source input files change, and old caches to expire after 30 days (it would be handy if this were customisable too, but that’s not as important). It also means that all 50 pipelines take around 2 minutes extra each, eating into the pipeline minutes.
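The figures above can be checked with a quick back-of-the-envelope calculation. The sketch below is only an illustration: the constant and function names are made up for this example, the numbers are the approximate ones quoted in the issue, and it assumes 1 GB = 1000 MB and that each pipeline runs once.

```python
# Approximate figures quoted in the issue; all names here are illustrative.
PIPELINES = 50              # "around 50" pipelines on the same repository
CACHE_SIZE_MB = 600         # approximate size of one NuGet cache entry
EXTRA_MINUTES_PER_RUN = 2   # approximate extra time per pipeline run


def duplicated_storage_gb(pipelines: int, cache_size_mb: int) -> float:
    """Total storage when every pipeline keeps its own copy (1 GB = 1000 MB)."""
    return pipelines * cache_size_mb / 1000


def extra_minutes(pipelines: int, minutes_per_run: int) -> int:
    """Extra pipeline minutes when every pipeline runs once."""
    return pipelines * minutes_per_run


print(duplicated_storage_gb(PIPELINES, CACHE_SIZE_MB))   # 30.0
print(extra_minutes(PIPELINES, EXTRA_MINUTES_PER_RUN))   # 100
```

With a single shared cache the storage would stay at roughly 600 MB regardless of the number of pipelines.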
I’ve attached a screenshot of a comparison between the two cache job runs in two separate pipelines. Everything except the X-TFS-Session identifier is exactly the same. Ideally, the cache key should be shared between pipelines that run on the same repo.
In addition, if I were able to share this cache between pipelines, I could reduce the build time by another 2 or 3 minutes, as I could cache the main solution binaries too - which don’t change that often. It’s an odd scenario, but one which suits this particular client’s build requirements perfectly.
I’ve also posted here:
https://developercommunity.visualstudio.com/idea/1030422/share-cache-across-pipelines.html
Task logs
Issue Analytics
- Created 3 years ago
- Reactions: 17
- Comments: 10 (1 by maintainers)
This pipeline/branch scoping makes the cache task extremely inefficient. In my environment, CI builds are unable to use the cache produced by PR builds (different pipelines). Literally, pipelines produce and store all this cached data for nothing.
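The effect described in this comment can be pictured with a toy model. This is not the Azure Pipelines implementation; it is a minimal sketch that assumes cache entries are isolated by a (pipeline, branch) scope in addition to the key. The class `ScopedCache`, the pipeline names `pr-build` and `ci-build`, and the key string are all hypothetical, and the real service has further rules (for example, which branches a run may read from) that this model ignores.

```python
from typing import Dict, Optional, Tuple


class ScopedCache:
    """Toy cache whose entries are isolated by (pipeline, branch) scope."""

    def __init__(self) -> None:
        # The scope is part of the lookup key, so identical cache keys in
        # different pipelines never collide and never match each other.
        self._entries: Dict[Tuple[str, str, str], str] = {}

    def save(self, pipeline: str, branch: str, key: str, payload: str) -> None:
        self._entries[(pipeline, branch, key)] = payload

    def restore(self, pipeline: str, branch: str, key: str) -> Optional[str]:
        return self._entries.get((pipeline, branch, key))


cache = ScopedCache()
key = 'nuget | "MySolution.sln" | "Windows_NT"'  # hypothetical key

# The PR pipeline populates the cache.
cache.save("pr-build", "main", key, "packages-v1")

# Same pipeline, same key: hit.
hit = cache.restore("pr-build", "main", key)

# Different pipeline, identical key: miss, so a second copy gets created.
miss = cache.restore("ci-build", "main", key)

print(hit)   # packages-v1
print(miss)  # None
```

Under this model, every additional pipeline that uses the same key stores its own copy, which matches the duplication reported in the issue.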
Don’t forget that saying “cache per pipeline” is not accurate in my case; it is actually “cache per job per pipeline”. I can repeatably show this by running a sequence of 3 jobs, where jobs 1 and 3 use Maven caching under the same key: the items produced (and cached) in job 1 are not restored in job 3, even though job 2 in between takes at least 5 minutes, job 2 depends on job 1, and job 3 depends on job 2. So what is the reliable definition of a minimum hit rate if it is not even met within the “same” pipeline for a dependent sequence of jobs? Or is it a bug?