Docker Buildkit caching?

Proposed change

It would be wonderful to use the Docker BuildKit caching capabilities. These enable incremental addition of packages, so one can add a single package to a list like requirements.txt. That change still invalidates the standard Docker layer cache, but the layer is rebuilt quickly because previously downloaded and compiled artifacts are reused from the build cache instead of triggering an entire rebuild. The actual building happens in a special Docker container (which retains the caches).

https://docs.docker.com/develop/develop-images/build_enhancements/
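
To make the idea concrete, here is a minimal sketch of what a cache-mounted install step could look like. It is written as a small Python helper that renders Dockerfile text, since repo2docker generates its Dockerfiles from Python. The helper name `render_dockerfile`, the base image, and the paths are assumptions for illustration only; this is not how repo2docker currently builds images.

```python
from textwrap import dedent


def render_dockerfile(base_image: str = "python:3.11-slim",
                      requirements: str = "requirements.txt",
                      cache_dir: str = "/root/.cache/pip") -> str:
    """Return Dockerfile text whose pip step uses a BuildKit cache mount.

    Hypothetical helper: base image, file names and cache path are
    illustrative assumptions, not repo2docker defaults.
    """
    return dedent(f"""\
        # syntax=docker/dockerfile:1
        FROM {base_image}
        COPY {requirements} /tmp/{requirements}
        # The cache mount keeps pip's download/wheel cache between builds,
        # so editing {requirements} only fetches and builds what is new.
        RUN --mount=type=cache,target={cache_dir} \\
            pip install -r /tmp/{requirements}
        """)


if __name__ == "__main__":
    print(render_dockerfile())
```

The rendered file would be built with BuildKit enabled, for example `DOCKER_BUILDKIT=1 docker build .` or `docker buildx build .`. Adding one line to requirements.txt still reruns the `RUN` step, but pip can reuse whatever is already in the mounted cache.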

This question includes links to examples for Python builds, but points out that the approach doesn’t work for R builds (which apparently will need something like renv to work).

https://stackoverflow.com/questions/59253392/using-docker-buildkit-caching-with-r-packages

Alternative options

I don’t know enough to know if repo2docker is already doing something awesome to reuse compilation here. Perhaps a Docker layer per package build? I wonder if that would cause issues, though.

Who would use this feature?

Anyone adding a package would benefit from much quicker rebuilds. It should also help with builds in a place like mybinder.

How much effort will adding it take?

I haven’t yet looked at the repo2docker build code, so I don’t know. Mea culpa. The biggest issue is that this requires a fairly recent Docker and changes the build process so that builds happen in a Docker container rather than on the host.
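
As a rough illustration of the "fairly recent Docker" requirement, the sketch below checks a Docker version string against 18.09, the release in which BuildKit first shipped. The function names are hypothetical, and the version string is assumed to come from something like `docker version --format '{{.Server.Version}}'`; newer features such as cache mounts may need a later release than this minimum.

```python
import re
from typing import Tuple

# BuildKit first shipped with Docker 18.09; treated here as the minimum.
BUILDKIT_MIN_VERSION = (18, 9)


def parse_docker_version(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings like '20.10.7' or '18.09.0-ce'."""
    match = re.match(r"(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognised Docker version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def supports_buildkit(version: str) -> bool:
    """Return True if the given Docker version is at least 18.09."""
    return parse_docker_version(version) >= BUILDKIT_MIN_VERSION
```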

Who can do this work?

I could help test, but have not yet dived into the repo2docker code.

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

jameshowison commented, Apr 24, 2020 (1 reaction)

Not sure what to do about the docker-py and buildx issues, but I did manage to get renv working with docker buildx (after stumbling around for quite a while 😃).

https://github.com/howisonlab/test_repo_buildx_renv

I tried to keep things as close to REES as possible by using an unchanged install.R file 😃

manics commented, Apr 29, 2021 (0 reactions)

One way this might work is to take what we already have, and:

  • unarchive what we have in a tempdir
  • run buildx build in the tempdir

That’s pretty much what I’m doing in https://github.com/manics/repo2docker-podman/ (see https://github.com/manics/repo2docker-podman/blob/225265a8c09733250eb1a21efae4c12c7bf35e57/repo2podman/podman.py#L281-L288) and https://github.com/manics/repo2shellscript

Would this be a good use case for https://github.com/jupyterhub/repo2docker/pull/848 (both my above projects rely on it), and putting buildx in a new engine?
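
The two steps described in this comment could be sketched roughly as follows. This assumes that "what we already have" is a tar archive of the build context held in memory; the function names, the `run` hook, and the choice of `--load` are illustrative assumptions, not code from repo2docker or repo2docker-podman.

```python
import io
import subprocess
import tarfile
import tempfile
from typing import Callable, List, Optional


def buildx_command(context_dir: str, tag: str) -> List[str]:
    """Build the `docker buildx build` command line for a context directory."""
    # --load puts the resulting image into the local Docker image store.
    return ["docker", "buildx", "build", "--tag", tag, "--load", context_dir]


def build_from_archive(archive: bytes, tag: str,
                       run: Optional[Callable[[List[str]], object]] = None
                       ) -> List[str]:
    """Unpack a tar build context into a temp dir and run buildx there.

    `run` is a hypothetical hook so the command can be intercepted in tests;
    by default the command is executed with subprocess.
    """
    if run is None:
        run = lambda cmd: subprocess.run(cmd, check=True)
    with tempfile.TemporaryDirectory() as tmpdir:
        # Step 1: unarchive the build context in a temporary directory.
        with tarfile.open(fileobj=io.BytesIO(archive)) as tar:
            tar.extractall(tmpdir)
        # Step 2: run buildx build against that directory.
        cmd = buildx_command(tmpdir, tag)
        run(cmd)
    return cmd
```

Keeping the command construction separate from the execution would make it easier to drop this into a separate build engine of the kind the comment asks about, since only `run` touches the Docker CLI.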
