Invalid hashes when running multiple Poetry installs simultaneously

(Please note that the projects are in sibling directories named project_a and project_b.)

Issue

When running multiple poetry installs simultaneously with a shared Poetry cache directory, the operation commonly fails with the infamous “invalid hashes” error. This is common in CI and monorepo environments. Here’s a sample:

RuntimeError

  Invalid hashes (sha256:c7a7026632f45188f4a4548cc308c5c0683d9b8259da5cbfe0301f7527843eb4) for pandas (1.0.5) using archive pandas-1.0.5-cp36-cp36m-manylinux1_x86_64.whl. Expected one of
[omitted the other hashes for the sake of brevity]
sha256:faa42a78d1350b02a7d2f0dbe3c80791cf785663d6997891549d0f86dc49125e.

  at ~/.local/share/pypoetry/venv/lib/python3.9/site-packages/poetry/installation/executor.py:627 in _download_link
      623│                     )
      624│                 )
      625│
      626│             if archive_hashes.isdisjoint(hashes):
    → 627│                 raise RuntimeError(
      628│                     "Invalid hashes ({}) for {} using archive {}. Expected one of {}.".format(
      629│                         ", ".join(sorted(archive_hashes)),
      630│                         package,
      631│                         archive_path.name,

After receiving this error, if I run find . -name pandas-1.0.5-cp36-cp36m-manylinux1_x86_64.whl, and then run a checksum on the file, I usually get a SHA256 from the “expected” list in the error message. If I don’t get an expected hash, it seems to be related to another poetry install process that’s still running and downloading that artifact.
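The manual checksum step described above can be scripted. This is a minimal sketch, not part of Poetry: the helper names `sha256_of` and `matches_expected` are hypothetical, and the digest is formatted as `sha256:<hex>` only so it can be compared directly with the entries in the error message.

```python
import hashlib
from pathlib import Path
from typing import Iterable


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the file's digest in the 'sha256:<hex>' form shown in the error."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read in chunks so a large wheel is not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return f"sha256:{digest.hexdigest()}"


def matches_expected(path: Path, expected: Iterable[str]) -> bool:
    """True if the file's digest is one of the expected hashes."""
    return sha256_of(path) in set(expected)
```

Running `matches_expected` on each path that `find` reports, with the "expected" list copied from the error, shows whether a cached archive is complete at the moment of the check.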

My current working theory is something like this:

  • Poetry install process A checks the cache for an arbitrary package (say pandas, since that’s what’s in the example error above). The process gets a cache miss and starts downloading pandas.
  • Poetry install process B tries to install pandas. It checks the cache and finds A’s pandas. However, this download is incomplete, so when process B does the hash check, it’s wrong.
  • Process B fails
  • Process A finishes the download, checks the cache and checksum and succeeds.
  • I manually check the SHA256 of the file and see that it’s correct because Process A has finished and I, as a human, am inherently slower than a computer.
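The theory above can be illustrated with a small, deterministic simulation. This is only a sketch of the suspected sequence, not Poetry's code: the two "processes" are interleaved by hand in a single function, and `simulate_race` and the file name `example.whl` are made up for the illustration.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Tuple


def sha256_hex(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


def simulate_race(payload: bytes) -> Tuple[bool, bool]:
    """Return (process B's hash check passed, later manual check passed)."""
    expected = hashlib.sha256(payload).hexdigest()
    half = len(payload) // 2
    with tempfile.TemporaryDirectory() as cache_dir:
        archive = Path(cache_dir) / "example.whl"  # hypothetical cache entry
        with archive.open("wb") as fh:
            # Process A: cache miss, starts writing directly to the final path.
            fh.write(payload[:half])
            fh.flush()
            # Process B: finds the file, treats it as a cache hit, hashes it.
            b_ok = sha256_hex(archive) == expected
            # Process A: finishes the download.
            fh.write(payload[half:])
        # Human: checks the hash only after A has finished.
        manual_ok = sha256_hex(archive) == expected
    return b_ok, manual_ok
```

For any non-empty payload, process B's check fails on the half-written file while the later manual check passes, which matches the observations in the list.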

Is there a way we can fix this so that multiple Poetry projects with a common cache directory can safely run simultaneously on the same machine? My initial proposed solution is to simply update the download process to download artifacts directly to the system’s temp directory and only copy them into the cache once the download is complete. That way, all processes either get a cache miss, or a cache hit with a correct checksum.

Thoughts on this?

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 5
  • Comments: 10 (1 by maintainers)

Top GitHub Comments

2 reactions
ehiggs commented, Feb 3, 2022

This can happen if you ^C poetry when it’s installing. When you rerun it, it finds the local file in the cache and calculates the hash of the truncated file.

Poetry needs to download to a dotfile, a .download file, or something similar, and then move it into place to avoid this.

My initial proposed solution is to simply update the download process to download artifacts directly to the system’s temp directory and only copy them into the cache once the download is complete.

This means copying across file systems which can be expensive. Would be best to use the same dir as where it’s being downloaded to make the ‘move into place’ ~atomic.
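The approach suggested in this comment could look roughly like the following. It is a sketch under stated assumptions, not the change made in Poetry: `write_atomically` is a hypothetical helper, the download is modelled as an iterable of byte chunks, and the `.download` suffix is just the naming idea from the comment. The temporary file is created in the same directory as the final path so that `os.replace` stays on one filesystem, where the rename is atomic on POSIX systems.

```python
import contextlib
import os
import tempfile
from pathlib import Path
from typing import Iterable


def write_atomically(chunks: Iterable[bytes], final_path: Path) -> Path:
    """Write chunks to a hidden temp file next to final_path, then rename it."""
    final_path.parent.mkdir(parents=True, exist_ok=True)
    # Same directory as the target, so the rename never crosses filesystems.
    fd, tmp_name = tempfile.mkstemp(
        dir=final_path.parent,
        prefix=f".{final_path.name}.",
        suffix=".download",
    )
    try:
        with os.fdopen(fd, "wb") as fh:
            for chunk in chunks:
                fh.write(chunk)
            fh.flush()
            os.fsync(fh.fileno())
        # Readers see either no file or the complete file, never a partial one.
        os.replace(tmp_name, final_path)
    except BaseException:
        # Also runs on Ctrl-C, so an interrupted download leaves nothing behind.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise
    return final_path
```

With this shape, a concurrent process checking `final_path` gets either a cache miss or a complete archive, and an interrupted run does not leave a truncated file under the final name.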

1 reaction
zweger commented, Aug 17, 2022

I’ve opened PR #6186 to address this issue.

