Introduce support for URLs pointing to python packages hosted in cloud provider storage buckets

What’s the problem this feature will solve?

To store private Python packages securely and conveniently, I currently build them into archives (.tar.gz files) and push them to S3. I’ve written a small command-line tool to help with publishing and downloading packages from S3.

The problem is that I end up managing my dependencies in two places: a file I call s3pypkg.txt and requirements.txt, since pip is unable to download and install these packages for me. It’s messy, and it’s confusing for anyone who doesn’t know about the second file. There are also always multiple steps to downloading and installing dependencies when building and deploying services.

This issue proposes supporting s3:// and gs:// URLs directly in pip.

Describe the solution you’d like

The solution I’d like is support for hosting an entire PyPI-style package repository inside an S3 or gs bucket, where it can be secured with cloud provider authentication and accessed using the cloud provider’s own tools. This change is fairly invasive and will probably take a good amount of deliberation and planning. Open questions include: “How is the bucket structure laid out?”, “How do we achieve efficient indexing?”, and “Can we use native libraries or cloud service REST APIs, or do we dispatch commands out to a command-line tool?”

Alternative Solutions

Another solution that I’ve been toying with is to support S3- or gs-hosted Python packages as an alternative form of URL (similar to http:// hosted packages), then simply select the appropriate downloader when fetching the package. This would likely support only source code archives and wouldn’t include good version resolution, but it at least pilots the idea and provides a workable short-term solution.
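
As a rough illustration of the "select the appropriate downloader" idea, the sketch below parses an s3:// or gs:// URL and dispatches on its scheme. It is not pip code: the function names `split_bucket_url` and `fetch` and the `downloaders` mapping are hypothetical, and the downloaders are plain callables standing in for whatever cloud client would do the real transfer.

```python
from urllib.parse import urlparse

# Schemes this sketch understands; anything else is rejected.
SUPPORTED_SCHEMES = ("s3", "gs")


def split_bucket_url(url):
    """Split 'scheme://bucket/key' into (scheme, bucket, key)."""
    parts = urlparse(url)
    if parts.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported scheme in {url!r}")
    key = parts.path.lstrip("/")
    if not parts.netloc or not key:
        raise ValueError(f"expected scheme://bucket/key, got {url!r}")
    return parts.scheme, parts.netloc, key


def fetch(url, downloaders):
    """Call the downloader registered for the URL's scheme.

    `downloaders` maps a scheme to a callable taking (bucket, key).
    The callables are hypothetical stand-ins for real cloud clients.
    """
    scheme, bucket, key = split_bucket_url(url)
    return downloaders[scheme](bucket, key)


if __name__ == "__main__":
    # Stub downloaders that only report what they would fetch.
    stubs = {
        "s3": lambda bucket, key: f"would fetch {key} from S3 bucket {bucket}",
        "gs": lambda bucket, key: f"would fetch {key} from gs bucket {bucket}",
    }
    print(fetch("s3://my-bucket/pkgs/foo-1.0.tar.gz", stubs))
```

Keeping the scheme-to-downloader mapping outside the parsing step means a new provider only needs a new entry in the mapping, which matches the short-term, archive-only scope described above.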

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 8 (5 by maintainers)

Top GitHub Comments

3 reactions
pradyunsg commented, Jan 14, 2022

Is it not possible to have an intermediate server to handle the translation between PEP 503-style pages and S3? The open design questions can then be solved in that server, which can evolve independently from pip. I don’t know if something like this exists in a public codebase, but it likely does in things like Artifactory.

I don’t think this is common enough to justify pushing the complexity into pip, nor would pip be the right place for that complexity.

I’m certainly not too comfortable with adding features in pip that are primarily for folks in corporate spaces, both because pip’s maintainers have limited visibility into such environments when things go wrong and because I don’t want to spend my free time maintaining something that primarily serves to externalise costs for businesses, since so far we haven’t seen them do much to help with pip’s development.
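
To make the intermediate-server suggestion concrete, here is a minimal sketch (not taken from the thread) of the page-building half of such a server: it normalizes a project name the way PEP 503 specifies and renders a simple project page from a list of files. The function names and the example URL are hypothetical. How the server lists bucket objects and what links it hands out (for example, time-limited signed URLs) are left as open assumptions.

```python
import html
import re


def normalize(name):
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def project_page(file_urls):
    """Render a PEP 503-style project page.

    `file_urls` maps each archive filename to the URL the server wants
    pip to download it from. Producing those URLs from bucket objects is
    outside this sketch.
    """
    links = "\n".join(
        f'    <a href="{html.escape(url, quote=True)}">{html.escape(name)}</a><br>'
        for name, url in sorted(file_urls.items())
    )
    return f"<!DOCTYPE html>\n<html>\n  <body>\n{links}\n  </body>\n</html>\n"


if __name__ == "__main__":
    # Hypothetical package and URL, for illustration only.
    files = {
        "my-private-pkg-1.0.tar.gz": "https://example.invalid/my-private-pkg-1.0.tar.gz",
    }
    print(normalize("My_Private.Pkg"))
    print(project_page(files))
```

A server built along these lines could be passed to pip through its existing `--index-url` or `--extra-index-url` options, so pip itself would need no changes.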

1 reaction
pfmoore commented, Jan 13, 2022

You can of course add security to the server. Simpleserver may not provide that, but it’s certainly possible to write your own version which does. Off-the-shelf solutions like devpi and Artifactory are also available.

Overall, I’m -0.5 on this change. I’m uncomfortable about the maintenance implications (for a start, how would we test this feature?) and I think the requirement is pretty rare. I think I’d need to see a much better justification for this feature than the simple statement that you would find it useful, and you use s3 “in order to securely and conveniently store private python packages”. Many organisations securely store private packages and we’ve never had a request for s3 support from them, which suggests that other solutions tend to be used.

On the other hand, if we did add support for this, I’d likely just ignore it (I don’t need it myself, and I don’t have to work on any issues related to it) so I won’t object if another pip developer wants to approve this.
