Stop vendoring packages
The current implementation of Setuptools has its dependencies (separately for `setuptools` and `pkg_resources`) vendored into the package. The project implemented this approach because of the bootstrapping problem.
Bootstrapping Problem
Setuptools extended distutils, which was designed to sit at the root of a packaging system and presumed to be present and without any dependencies. Over time, Setuptools superseded distutils but inherited these constraints. In particular, when a system integrator wishes to build a system from sources, it requires a directed acyclic graph (DAG) of both build-time and runtime dependencies. As a result, Setuptools cannot depend on any package that uses Setuptools to build (or whose dependencies require Setuptools to build). This bootstrapping problem includes Setuptools itself, as it requires itself to build.
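To make the DAG constraint concrete, here is a minimal sketch using the standard library's `graphlib`. The graph contents are hypothetical (in particular, the assumption that `packaging` builds with Setuptools is for illustration only); the point is that a build order exists only when the build-time graph has no cycle.

```python
from graphlib import CycleError, TopologicalSorter

def build_order(graph):
    """Return a valid build order, or None if the graph contains a cycle."""
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError:
        return None

# Hypothetical graph: each key lists what must be built before it.
# Assumption for illustration: setuptools declares packaging as a real
# dependency while packaging itself needs setuptools to build.
with_declared_dependency = {
    "setuptools": {"packaging"},
    "packaging": {"setuptools"},
    "wheel": {"setuptools"},
}

# Same projects, but setuptools carries no external build requirements
# (which is the situation vendoring creates).
with_vendoring = {
    "setuptools": set(),
    "packaging": {"setuptools"},
    "wheel": {"setuptools"},
}

cyclic = build_order(with_declared_dependency)   # None: no valid order exists
acyclic = build_order(with_vendoring)            # setuptools comes first
```

A system integrator building from source is in the position of `build_order`: a cycle leaves nothing that can be built first.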
Vendoring
As the ecosystem grew more complex, with standards-based implementations (such as packaging) appearing as third-party packages rather than in the standard library, Setuptools found itself requiring dependencies. Because of the bootstrapping problem, it adopted a vendoring strategy (copying the packages directly into the project) as a means of obtaining that functionality.
However, this approach creates constraints and complications for the project:
- Not all packages can be vendored.
- If the other package has vendored dependencies, those may not work in a vendored context.
- Some packages have global state or modify global state or have interfaces that are reliant on the package layout (incl. pkg_resources, importlib_metadata, packaging), leading to unexpected failures when loaded in a global context.
- Refactoring functionality out of the library is difficult if not impossible due to the constraints above. In particular, this project would like to move `pkg_resources` into a separate package, but even though `pkg_resources` has no dependency on `setuptools` to run, it still must vendor its own dependencies (is this true?).
- When vendoring a package, it often is required to be rewritten to accommodate vendoring. Any absolute imports must be replaced by relative imports.
- Because vendoring is a second-class approach to dependency management (and unsustainable in the general case), it often requires specialized tooling to manage the dependencies and this management can often fall in conflict with the first-class tools.
- When vendoring dependencies, it’s the responsibility of the hosting package to re-write imports to point to the vendored copies, creating non-standard usage with sometimes unclear semantics.
- Due to the constraints above, adding a new dependency can be an onerous process, requiring extra care and testing that may break in downstream environments whose workflows aren’t proven.
- Because vendored dependencies are de facto satisfied, the project cannot and should not declare those dependencies as other projects do. Therefore, it’s not possible to inspect the dependencies readily as one would with standard declarations.
- Because vendored dependencies are effectively pinned, they create impedance and require manual intervention or extra, non-standard tooling to manage the evolution of those dependencies, defaulting to a practice of erosion.
- Because they’re pinned, it’s more difficult to discern if a particular dependency is pinned for a known good reason or simply because of the vendoring.
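The import-rewriting burden mentioned in the list above can be illustrated with a deliberately simplified sketch. It is not how any real vendoring tool works; the namespace `myproject._vendor` and the helper `rewrite_imports` are hypothetical names chosen for the example. It rewrites plain `import x` and `from x... import y` lines to point at a vendored namespace, and refuses forms it cannot handle.

```python
import re

FROM_RE = re.compile(r"^(\s*)from\s+([A-Za-z_]\w*)((?:\.\w+)*)\s+import\s+(.+)$")
IMPORT_RE = re.compile(r"^(\s*)import\s+([A-Za-z_]\w*)((?:\.\w+)*)\s*$")

def rewrite_imports(source, vendored, namespace):
    """Point imports of vendored top-level packages at `namespace`.

    Line-based and intentionally naive: multi-line imports, `import a, b`,
    and dynamic imports are not detected at all.
    """
    rewritten = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        from_match = FROM_RE.match(line)
        import_match = IMPORT_RE.match(line)
        if from_match and from_match.group(2) in vendored:
            indent, top, rest, names = from_match.groups()
            line = f"{indent}from {namespace}.{top}{rest} import {names}"
        elif import_match and import_match.group(2) in vendored:
            indent, top, rest = import_match.groups()
            if rest:
                # `import a.b` binds the name `a`; there is no one-line
                # equivalent under a different namespace, so fail loudly.
                raise ValueError(
                    f"line {lineno}: cannot rewrite dotted import {line.strip()!r}"
                )
            line = f"{indent}from {namespace} import {top}"
        rewritten.append(line)
    return "\n".join(rewritten)

example = "import packaging\nfrom packaging.version import Version\nimport os\n"
result = rewrite_imports(example, {"packaging"}, "myproject._vendor")
```

Even this toy version shows why the rewritten code is non-standard: the vendored package no longer imports itself under its own name, and some import styles have to be rejected or patched by hand.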
Issue Analytics
- State:
- Created 2 years ago
- Comments: 13 (13 by maintainers)
Top GitHub Comments
Thanks for the explanation. If the end goal is for setuptools to host the schemas, then let’s focus on the second option indeed. It’d still be nice to be able to easily regenerate the resulting data, but I don’t think that’s a priority, i.e. something to put on the “far TODO”.
I don’t think we can do this yet.
Not until pip removes the fallbacks to direct setup.py invocation (https://pip.pypa.io/en/stable/reference/build-system/setup-py/ – in-progress) and the separation of pkg_resources is completed. Without those, I think this would exceed our available churn budget. 😃
setuptools is installed with pip, by all supported mechanisms to install pip – https://github.com/pypa/pip/issues/10530#issuecomment-932937829. IMO this should be considered a pre-condition for doing this, and it’s worthwhile spending a few months or so publicising this change before we actually make it.
FWIW, pip’s solution for this is something I maintain separately: https://github.com/pradyunsg/vendoring (I’ll consistently use `vendoring` to refer to this project in the rest of this post). I’m happy to accommodate setuptools in that. That should help alleviate many of the pain points with vendoring dependencies.
If you’re curious about how the tool works, my suggestion is to clone pip and run `tox -e vendoring` / `nox -s vendoring` and look at the vendor.txt, tools/vendoring/patches/*, and pyproject.toml files in pip’s source tree.
`vendoring` provides automated import rewriting, with an explicit error on import styles that can’t be re-written.
`__import__` – in these cases, the dynamic import logic needs to be patched manually.
`==` pins, for obvious reasons) serve as the source of truth for this alleviates the lack of transparency.
I think these constraints are non-existent / workable, if you adopt `vendoring`. 😃
This is somewhat true – you still have the caveat of needing to manage evolution, but you can use standard tooling (like dependabot) to manage the upgrading, if you go down the `vendoring` route. It’s also possible to have separate unpinned/pinned dependency declaration sets (à la pip-compile’s workflow). You do need to ensure that the entire dependency tree is included and pinned, for using `vendoring`.
Basically, the moment you start vendoring stuff, you need to start thinking of the project as being managed like an application with pinned dependencies – all the corresponding dependency management constraints apply (except you run `vendoring sync .` instead of `pip install -r [blah blah]` to “install” the dependencies).
If you adopt `vendoring`, this is not true.
It is possible to include comments in the input file to the tool (it’s basically a requirements.txt file, that’s consumed by pip), which can be useful to describe this nuance.
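As a rough illustration of the “everything pinned, with comments explaining why” workflow, here is a small sketch that scans a requirements-style file and reports entries that are not `==` pins. The file contents, version numbers, and the stated reason are all made up for the example, and the parser is a simplification (it does not handle markers, extras, or URLs); it is not part of `vendoring` or pip.

```python
import re

# A name followed by an exact `==` pin; anything else counts as unpinned here.
PIN_RE = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)\s*==\s*([^\s;#]+)")

def unpinned_entries(text):
    """Return requirement lines that are not exact `==` pins."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments (naive split)
        if not line:
            continue
        if not PIN_RE.match(line):
            problems.append(line)
    return problems

# Hypothetical vendor.txt-style input; versions and the reason are invented.
vendor_txt = """\
# Everything below is copied into the project, so every entry must be pinned.
packaging==21.3   # held back on purpose: see tracking issue (made-up reason)
pyparsing==3.0.9
more-itertools>=8.8
"""

problems = unpinned_entries(vendor_txt)
```

The comment next to a pin is what distinguishes “pinned for a known reason” from “pinned because it is vendored”, which is the nuance the input file can carry.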
Yep, and… it shouldn’t be too difficult if you use vendoring to bring that package into setuptools.
True. Anything that’s non-pure-Python or provides an importable API for plugins is non-vendorable.
Generally, not true; based on my experience with `vendoring`. It is possible to rewrite their imports seamlessly.
I’m not quite sure what you mean here – `setuptools.extern.packaging.version.Version` is always going to compare not equal to `pip._vendor.packaging.version.Version` because they’re different packages (and possibly different versions of `packaging` too!). If that’s what you’re referring to, yea… but I don’t see this as being a big problem.
The only case this can be an issue is if you expect this difference to exist/not exist in some code – since that’d be fragile. It’s often really straightforward to avoid that though.
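The comparison behaviour described here can be reproduced without either project installed. The sketch below uses a toy `Version` class (a stand-in, not the real `packaging.version.Version`) and loads the same source twice under two hypothetical module names, mimicking two vendored copies.

```python
import types

# Toy stand-in for a version class; same source is loaded twice below.
VERSION_SOURCE = '''
class Version:
    def __init__(self, text):
        self.release = tuple(int(part) for part in text.split("."))

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.release == other.release

    def __hash__(self):
        return hash(self.release)
'''

def load_copy(module_name):
    """Execute the source in a fresh module, like an independent vendored copy."""
    module = types.ModuleType(module_name)
    exec(VERSION_SOURCE, module.__dict__)
    return module

# Hypothetical module names standing in for two projects' vendored copies.
copy_a = load_copy("project_a._vendor.version")
copy_b = load_copy("project_b._vendor.version")

same_copy_equal = copy_a.Version("1.0") == copy_a.Version("1.0")
cross_copy_equal = copy_a.Version("1.0") == copy_b.Version("1.0")
classes_identical = copy_a.Version is copy_b.Version
```

Each copy defines its own class object, so the `isinstance` check fails across copies and the comparison falls back to identity. Code that never mixes objects from the two copies is unaffected.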