`poetry install` does not populate cache

  • I am on the latest Poetry version.
  • I have searched the issues of this repo and believe that this is not a duplicate.
  • If an exception occurs when executing a command, I executed it again in debug mode (-vvv option).
  • OS version and name: macOS 10.14.6
  • Poetry version: 1.0.3
  • Contents of your pyproject.toml file:
[tool.poetry]
name = "poetry-test"
version = "0.1.0"
description = ""
authors = ["redacted"]

[tool.poetry.dependencies]
python = "^3.6"
pathlib2 = "^2.3.5"

[tool.poetry.dev-dependencies]

[build-system]
requires = ["poetry>=0.12"]
build-backend = "poetry.masonry.api"

Issue

poetry install does not populate the filesystem cache, but poetry add does. Aside from being a little surprising, this makes caching on CI machines effectively impossible since they only ever install and the cache never changes, so they always have to start with a blank slate.

A shell session demonstrating the issue:

$ poetry init

# ...snip...

Generated file

[tool.poetry]
name = "poetry-test"
version = "0.1.0"
description = ""
authors = ["<redacted>"]

[tool.poetry.dependencies]
python = "^3.6"

[tool.poetry.dev-dependencies]

[build-system]
requires = ["poetry>=0.12"]
build-backend = "poetry.masonry.api"


Do you confirm generation? (yes/no) [yes] yes

$ ls /Users/seankelley/Library/Caches/pypoetry/cache
# no output: the cache directory starts empty

$ poetry add pathlib2
Creating virtualenv poetry-test-D6aiWa4y-py3.6 in /Users/seankelley/Library/Caches/pypoetry/virtualenvs
Using version ^2.3.5 for pathlib2

Updating dependencies
Resolving dependencies... (0.7s)

Writing lock file


Package operations: 2 installs, 0 updates, 0 removals

  - Installing six (1.14.0)
  - Installing pathlib2 (2.3.5)

# the cache directory has some stuff in it from pypi
$ ls /Users/seankelley/Library/Caches/pypoetry/cache/repositories
pypi

# clear the cache manually
$ rm -r /Users/seankelley/Library/Caches/pypoetry/cache/repositories

# delete the virtualenv too for good measure
$ rm -r /Users/seankelley/Library/Caches/pypoetry/virtualenvs/poetry-test-D6aiWa4y-py3.6

$ poetry install
Creating virtualenv poetry-test-D6aiWa4y-py3.6 in /Users/seankelley/Library/Caches/pypoetry/virtualenvs
Installing dependencies from lock file


Package operations: 2 installs, 0 updates, 0 removals

  - Installing six (1.14.0)
  - Installing pathlib2 (2.3.5)


$ ls /Users/seankelley/Library/Caches/pypoetry/cache
# no output: the cache directory is empty even though we just installed stuff!

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 17
  • Comments: 8 (3 by maintainers)

Top GitHub Comments

seansfkelley commented, Jun 21, 2021 (7 reactions)

I don’t really understand the relevance of these parts of the comment:

> poetry install will only use the information present in the lock file for installation.

(and)

> The lock file is the source of truth for poetry install so that’s what should be tracked for changes.

This issue is about caching, not correctness. I’m not objecting to what poetry install produces or how that relates to the lockfile (because it’s correct), I’m just observing that it doesn’t include side effects that I expected and appear to be helpful for CI optimization.

As for the rest of your comment:

> The data in {cache-dir}/cache/repositories is the cache for the remote metadata so I am not sure why you would want it when doing poetry install.

It’s been a long time since I’ve written this issue so perhaps I understood the cache wrong or the behavior has changed, but IIRC, the presence of that cache dramatically improved install times. As for why I want it, I said so in the original comment:

> …this makes caching on CI machines effectively impossible since they only ever install and the cache never changes, so they always have to start with a blank slate.

To clarify further, I’m using Docker-based builds (in Travis), so they do the thing where they archive a bunch of files on disk so you can unarchive them in later builds in fresh Docker containers, thereby skipping a bunch of work for that later build.
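One possible way to key such an archive, offered only as a sketch: derive the key from the contents of poetry.lock, so the archive is reused until the lock file changes. The function name `lockfile_cache_key` and the `poetry-` prefix are hypothetical and not part of Poetry or Travis.

```shell
#!/bin/sh
# Sketch: derive a CI cache key from the contents of a lock file.
# Function name and key prefix are hypothetical, for illustration only.
lockfile_cache_key() {
    lockfile="$1"
    # Prefer sha256sum (common on Linux); fall back to shasum (common on macOS).
    if command -v sha256sum >/dev/null 2>&1; then
        digest=$(sha256sum "$lockfile" | cut -d ' ' -f 1)
    else
        digest=$(shasum -a 256 "$lockfile" | cut -d ' ' -f 1)
    fi
    echo "poetry-$digest"
}

# Usage (assumes poetry.lock is in the current directory):
# lockfile_cache_key poetry.lock
```

The key changes only when the lock file does, which matches the thread's description of the lock file as the source of truth for poetry install.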

If there is another disk location that would be helpful for CI caching instead of or in addition to that one, that would be helpful to know. I don’t see any documentation along these lines.

It’s also worth noting that I opened this issue before the addition of --remove-untracked, and I specifically did not want to cache the installed virtualenv because there was not yet a mechanism to remove any dependencies that existed in the Travis-provided archive but had been removed for the current build, thereby making the build impure. That should not be a problem now, but unfortunately I don’t have any sizable projects I can try it out on to verify any performance improvements. That said, I imagine that the cached metadata can still help improve performance, especially in the case where it’s expensive to retrieve.

sdispater commented, Jun 22, 2021 (4 reactions)

> It’s been a long time since I’ve written this issue so perhaps I understood the cache wrong or the behavior has changed, but IIRC, the presence of that cache dramatically improved install times. As for why I want it, I said so in the original comment:

The data in {cache-dir}/cache/repositories is only relevant for the dependency resolution process, not the installation process. Basically, it contains cached remote metadata. It is no longer relevant at installation time because the lock file already contains this metadata, so the cache does not need to be consulted.

That being said, there is a cache that matters for installation: {cache-dir}/artifacts, which holds the cached distributions so they do not have to be downloaded again on subsequent installations.
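As a rough sketch of the distinction drawn above, the following shell function reports whether the two subdirectories named in this thread exist under a given cache-dir. The function name `describe_poetry_cache` is hypothetical; the directory names are taken from the comments above and may differ in other Poetry versions.

```shell
#!/bin/sh
# Sketch: report which Poetry cache subdirectories exist under a cache-dir.
# "artifacts" holds downloaded distributions (used by install);
# "cache/repositories" holds remote metadata (used by dependency resolution).
# The function name is hypothetical; the layout is as described in this thread.
describe_poetry_cache() {
    cache_dir="$1"
    for sub in artifacts cache/repositories; do
        if [ -d "$cache_dir/$sub" ]; then
            echo "$sub: present"
        else
            echo "$sub: missing"
        fi
    done
}

# Usage, assuming `poetry config cache-dir` prints the configured location
# in your Poetry version:
# describe_poetry_cache "$(poetry config cache-dir)"
```

On a CI machine that only ever runs poetry install, one would expect from the explanation above that only the artifacts directory is worth archiving between builds.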
