Feature: Share cache across pipelines

Required Information

Type: Feature

Enter Task Name: Cache (https://docs.microsoft.com/en-us/azure/devops/pipelines/tasks/utility/cache?view=azure-devops / https://github.com/microsoft/azure-pipelines-tasks/tree/master/Tasks/CacheV2)

Environment

  • Azure Pipelines (yml)
  • Can provide account / build / pipeline information privately
  • Agent - Hosted vs2017-win2016

Issue Description

I have around 50 pipelines that run on the same hosted agent for the same repository. All the pipelines use the same templates, with lots of pipeline-specific variables. I use the following cache task in a build template:

  - task: Cache@2
    condition: and(succeeded(), eq('${{ parameters.enableNugetCaching }}', true))
    displayName: NuGet Cache
    inputs:
      key: 'nuget | "${{ parameters.solution }}" | "$(Agent.OS)" | ${{ parameters.nugetConfig }},**/packages.config,!**/bin/**,!**/obj/**'
      restoreKeys: |
          nuget | "${{ parameters.solution }}" | "$(Agent.OS)"
          nuget | "${{ parameters.solution }}"
      path: '${{ parameters.nugetPackagesDirectorySource }}'

Caching works perfectly when I run a single pipeline twice - it restores the expected cache, and creates a new cache if one of the matched files changes, as you would expect.
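
To illustrate why the key stays stable between runs and changes only when a matched file changes, here is a minimal Python sketch. It is a conceptual model only: the `fingerprint` function, the SHA-256 hashing, and the sample file contents are assumptions made for illustration, not the Cache task's actual implementation, and `App.sln` stands in for the real `parameters.solution` value.

```python
import hashlib


def fingerprint(literal_segments, matched_files):
    """Join literal key segments with a digest of the matched files' contents.

    Hypothetical model: the real Cache task's hashing scheme is not described
    in this issue.
    """
    digest = hashlib.sha256()
    for path in sorted(matched_files):  # sort so file ordering cannot change the result
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")
        digest.update(matched_files[path])
    return " | ".join(list(literal_segments) + [digest.hexdigest()])


# Illustrative values; "Windows_NT" is what $(Agent.OS) reports on Windows agents.
segments = ["nuget", "App.sln", "Windows_NT"]
files_v1 = {"src/App/packages.config": b"<packages>v1</packages>"}
files_v2 = {"src/App/packages.config": b"<packages>v2</packages>"}

key_first_run = fingerprint(segments, files_v1)
key_second_run = fingerprint(segments, files_v1)   # same inputs, same key
key_after_change = fingerprint(segments, files_v2)  # edited file, new key
```

In this model, `key_first_run` equals `key_second_run`, so the second run would restore the cache, while `key_after_change` differs and would cause a new cache to be saved.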

However, when I run another pipeline on the same repo and the same code, the cache task generates exactly the same key but does not match the existing cache; it creates a new one. As this cache is around 600 MB, the pipeline cache for the Azure DevOps organisation is now around 30 GB when it should be 600 MB. I would expect extra caches to be generated only when the source input files change, and old caches to expire after 30 days (it would be handy if this could be customised too, but that’s not as important). It also means that all 50 pipelines take around 2 minutes longer each, eating into the pipeline minutes.
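
The behaviour described above is consistent with each cache entry being looked up by both a per-pipeline scope and the key, not by the key alone. The following Python sketch models that assumption and reproduces the storage arithmetic. The `ScopedCacheStore` class and the scope names are hypothetical and are not part of Azure DevOps.

```python
class ScopedCacheStore:
    """Toy cache in which an entry is identified by (scope, key), not key alone."""

    def __init__(self):
        self._entries = {}  # (scope, key) -> size in MB

    def restore_or_save(self, scope, key, size_mb):
        """Return True on a cache hit; on a miss, store a new entry and return False."""
        if (scope, key) in self._entries:
            return True
        self._entries[(scope, key)] = size_mb
        return False

    def total_mb(self):
        return sum(self._entries.values())


store = ScopedCacheStore()
key = "nuget | App.sln | Windows_NT | <file hash>"  # identical in every pipeline

# 50 pipelines each run once with the same key but their own scope: all miss.
first_runs = [store.restore_or_save(f"pipeline-{i}", key, 600) for i in range(50)]

# Re-running one of those pipelines hits the entry it saved earlier.
second_run_same_pipeline = store.restore_or_save("pipeline-0", key, 600)

total_gb = store.total_mb() / 1000  # 50 entries x 600 MB
```

Under this model every first run misses even though the key is identical, so 50 copies of the 600 MB cache are stored, which is 30 GB in total. With a scope shared across pipelines, only the first run would miss and a single 600 MB entry would exist.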

I’ve attached a screenshot comparing the two cache job runs in two separate pipelines. Everything except the X-TFS-Session identifier is exactly the same. Ideally, the cache key should be shared between pipelines that run on the same repo.

In addition, if I were able to share this cache between pipelines, I could reduce the build time by another 2 or 3 minutes, as I could cache the main solution binaries too, which don’t change that often. It’s an odd scenario, but one which suits this particular client’s build requirements perfectly.

I’ve also posted here:

https://developercommunity.visualstudio.com/idea/1030422/share-cache-across-pipelines.html

Task logs

Cache log comparisons

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 17
  • Comments: 10 (1 by maintainers)

Top GitHub Comments

15 reactions
gaikovoi commented, Oct 5, 2020

This pipeline/branch scoping makes the cache task extremely inefficient. In my environment, CI builds are unable to use the cache produced by the PR build (different pipelines). The pipelines literally produce and store all this cached data for nothing.

1 reaction
cforce commented, Jun 16, 2021

Don’t forget that “cache per pipeline” is not accurate for me: it is actually “cache per job per pipeline”. I can repeatably show this by running a sequence of three jobs in which jobs 1 and 3 use Maven caching under the same key. Items produced (and cached) by job 1 are not restored in job 3, even though job 2 in between takes at least 5 minutes, job 2 depends on job 1, and job 3 depends on job 2. So what is the reliable definition of a minimum hit rate if it is not even met within the “same” pipeline for a dependent sequence of jobs? Or is it a bug?
