Splitting RequirementPreparer
Welcome to another edition of Pradyun dumps his thoughts in an issue to get inputs from actually smart humans!
Currently, RequirementPreparer has two “jobs” - fetch+unpack a requirement and generate metadata.
It does this using:
- unpack_url from pip._internal.download
- Distribution objects from pip._internal.distributions
There is some additional functionality that it provides. Mainly, it calls req.archive, which, well, is used in pip download. I think there’s benefit to splitting out all these operations.
Given that InstallRequirement.archive skips the generated pip-egg-info directory, I think it’s reasonable to move the archive generation logic so that it runs before metadata is generated.
This would result in a behavior change that I’m okay with – we’d create an archive before calling egg_info, when using pip download. I’m not sure if this affects setuptools_scm, but based on my rudimentary understanding, it shouldn’t. And if we really care a lot, I think we’d be better off moving the logic for the archiving into pip download, so that we can maintain the separation between these stages. We should probably be doing that anyway since in the more-correct resolver model, we’d only want to archive whatever is in the final set. Anyway, I’m calling it “not required” right now so I won’t be making that change.
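To illustrate why the ordering should not change what ends up in the archive, here is a minimal sketch. It is not pip's actual InstallRequirement.archive code; `archive_source_tree` and `demo` are hypothetical names, and the only behaviour borrowed from the description above is that a pip-egg-info directory is skipped. Under that assumption, archiving before or after metadata generation yields the same file list.

```python
import os
import tempfile
import zipfile
from typing import List


def archive_source_tree(source_dir: str, archive_path: str) -> List[str]:
    """Zip source_dir, skipping any pip-egg-info directory (hypothetical helper)."""
    names: List[str] = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for dirpath, dirnames, filenames in os.walk(source_dir):
            # Prune pip-egg-info in place so os.walk never descends into it.
            if "pip-egg-info" in dirnames:
                dirnames.remove("pip-egg-info")
            dirnames.sort()  # deterministic traversal order
            for filename in sorted(filenames):
                full_path = os.path.join(dirpath, filename)
                rel_path = os.path.relpath(full_path, source_dir)
                zf.write(full_path, rel_path)
                names.append(rel_path.replace(os.sep, "/"))
    return names


def demo() -> bool:
    """Archive a toy tree before and after fake metadata generation."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src")
        os.makedirs(os.path.join(src, "pkg"))
        for rel in ("setup.py", os.path.join("pkg", "__init__.py")):
            with open(os.path.join(src, rel), "w") as f:
                f.write("# placeholder\n")

        before = archive_source_tree(src, os.path.join(tmp, "before.zip"))

        # Simulate what metadata generation would leave behind.
        os.makedirs(os.path.join(src, "pip-egg-info"))
        with open(os.path.join(src, "pip-egg-info", "PKG-INFO"), "w") as f:
            f.write("Name: example\n")

        after = archive_source_tree(src, os.path.join(tmp, "after.zip"))
        return before == after
```

Calling `demo()` compares the two file lists. This says nothing about tools such as setuptools_scm that may inspect the tree in other ways, which is the open question raised above.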
With that change, all the fetching related logic would happen before metadata generation. That’ll allow splitting RequirementPreparer into RequirementFetcher and MetadataGenerators. This in turn would make it so that we can also introduce abstraction-style objects between these stages if we want to. I’m open to exploring that based on how the refactor here goes.
My understanding is that we can get away with making the MetadataGenerators just functions. In the future, we could make them transform some kind of FetchedCandidate into a Distribution object. For now, they’ll consume an InstallRequirement, do whatever we’re doing today and return the same object. This change also leaves me unsure what we’d want to do with Distribution objects (they have the build isolation code and call the metadata generation code in InstallRequirement) but, hey, one step at a time. I’ll look into that once this “stage” is done.
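A rough sketch of the shape this split could take, purely for discussion. Everything here is a stand-in: the `InstallRequirement` dataclass is a toy rather than pip's real class, and `RequirementFetcher.fetch`, `generate_metadata`, and `prepare` are hypothetical names and signatures. The point is only that fetching finishes before metadata generation, and that the metadata step can be a plain function that takes a requirement and returns the same object.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class InstallRequirement:
    """Toy stand-in for pip's InstallRequirement (assumption, not the real class)."""
    name: str
    link: str
    source_dir: Optional[str] = None
    metadata: Optional[Dict[str, str]] = None


class RequirementFetcher:
    """Hypothetical first stage: fetch and unpack, nothing else."""

    def __init__(self, build_dir: str) -> None:
        self.build_dir = build_dir

    def fetch(self, req: InstallRequirement) -> InstallRequirement:
        # A real implementation would download and unpack req.link here;
        # this sketch only records where the sources would live.
        req.source_dir = f"{self.build_dir}/{req.name}"
        return req


def generate_metadata(req: InstallRequirement) -> InstallRequirement:
    """Hypothetical second stage as a plain function: same object in, same object out."""
    if req.source_dir is None:
        raise ValueError("requirement must be fetched before metadata generation")
    # Placeholder for whatever metadata generation does today.
    req.metadata = {"Name": req.name}
    return req


MetadataGenerator = Callable[[InstallRequirement], InstallRequirement]


def prepare(
    req: InstallRequirement,
    fetcher: RequirementFetcher,
    metadata_generator: MetadataGenerator = generate_metadata,
) -> InstallRequirement:
    """Run the two stages in order, mirroring what RequirementPreparer does in one place."""
    return metadata_generator(fetcher.fetch(req))
```

For example, `prepare(InstallRequirement("example", "https://example.invalid/example-1.0.tar.gz"), RequirementFetcher("/tmp/build"))` returns the same requirement object with `source_dir` and `metadata` filled in. Keeping the generator as a callable type leaves room to later swap in something that turns a FetchedCandidate into a Distribution.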
Issue Analytics
- Created 4 years ago
- Comments: 14 (14 by maintainers)
Top GitHub Comments
We should probably get PEP 658 in first, but that might not be trivial either with the current structure. Not quite sure, I only did a very brief investigation a while ago and may very well have missed obvious approaches.
Based on https://github.com/pypa/pip/issues/11447 and https://github.com/pypa/pip/issues/8670, my guess is that it’s still useful – if a little poorly implemented.
I do think that it’s worth exploring whether we should get rid of it once PEP 658 is implemented on PyPI though.