[FEATURE] Better and more complete parsing for `requirements.txt` files


We currently utilise pkg_resources.parse_requirements to parse lines from requirements.txt files.

This parser appears to have a number of shortcomings: several requirement formats are not supported. See:

  • CycloneDX/cyclonedx-python#315
  • CycloneDX/cyclonedx-python-lib#8

This feature will look at alternatives to the above method to attempt to support these other formats. CycloneDX/cyclonedx-python-lib#97 may have identified a candidate in requirements-parser - TBC.

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

jkowalleck commented, Dec 20, 2021 (1 reaction)

We might need to see what pip install -r uses as a parser. This is the parser most people know, and the reason why people expect the same features here, too.

Unfortunately the parser is internal: pip._internal.req.parse_requirements. See https://github.com/pypa/pip/blob/main/src/pip/_internal/req/__init__.py and, further, https://github.com/pypa/pip/blob/main/src/pip/_internal/req/req_file.py

The whole topic seems to be an issue because people don’t read properly and confuse our requirements.txt capabilities with the ones they know from some project they use, without knowing what they are actually doing. We should have cyclonedx-python-lib’s requirements parser comply with PEP 508, as the README states, and that is it. Everything else can be implemented by volunteer contributors if they need additional features.
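To make the distinction concrete, here is a minimal sketch that sorts requirements.txt lines into "looks like a PEP 508 requirement" versus pip-specific syntax. The function name `classify_line` and the category labels are hypothetical, and the regular expressions are a deliberate simplification of the PEP 508 grammar, not a validating parser.

```python
import re

# Simplified stand-in for a PEP 508 project name (assumption: not the full grammar).
_NAME = r"[A-Za-z0-9](?:[A-Za-z0-9._-]*[A-Za-z0-9])?"
# A name must be followed by end of line, extras, a specifier, a marker or "@".
_STARTS_WITH_NAME = re.compile(rf"^{_NAME}(?=$|[\s\[<>=!~;@])")
_STARTS_WITH_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")


def classify_line(line: str) -> str:
    """Roughly classify one requirements.txt line (hypothetical helper)."""
    # Drop comments: "#" at line start or preceded by whitespace.
    text = re.sub(r"(^|\s+)#.*$", "", line).strip()
    if not text:
        return "blank"
    # Options such as -r, -e, --index-url or a trailing --hash are pip syntax.
    if text.startswith("-") or " --" in text:
        return "pip-option"
    # Bare URLs and local paths are accepted by pip but are not name-based.
    if _STARTS_WITH_SCHEME.match(text) or text.startswith((".", "/")):
        return "url-or-path"
    if _STARTS_WITH_NAME.match(text):
        return "pep508-like"
    return "unknown"


for sample in ["Jinja2", "requests>=2.0  # http client", "-r other.txt", "./local/pkg"]:
    print(f"{sample!r}: {classify_line(sample)}")
```

A parser that only promises PEP 508 compliance would be expected to handle the "pep508-like" lines; the other categories are the pip features that users tend to assume.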

CasperGN commented, Aug 19, 2022 (0 reactions)

@jkowalleck thanks for the quick reply!

It’s my understanding - which may be incorrect - that PEP508 allows for requirements.txt to only supply the name of the dependency, that’s at least what I gather from the abstract: “… The job of a dependency is to enable tools like pip [1] to find the right package to install. Sometimes this is very loose - just specifying a name, and sometimes very specific - referring to a specific file to install…”.
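As an illustration of the name-only case, the following sketch splits a simple requirement line into its name and, when there is exactly one `==` clause, its pinned version. The helper `name_and_pin` is hypothetical and the pattern is a simplified approximation of PEP 508; extras and environment markers are skipped rather than interpreted.

```python
import re
from typing import Optional, Tuple

# Simplified requirement shape: name, optional [extras], specifier, optional ;marker.
_REQ = re.compile(
    r"^\s*(?P<name>[A-Za-z0-9](?:[A-Za-z0-9._-]*[A-Za-z0-9])?)"
    r"\s*(?:\[[^\]]*\])?"
    r"\s*(?P<spec>[^;]*?)\s*(?:;.*)?$"
)
# Exactly one "==" clause without wildcards counts as pinned here.
_PINNED = re.compile(r"^==\s*(?P<version>[A-Za-z0-9.!+_-]+)$")


def name_and_pin(line: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (name, pinned_version_or_None), or None if the line does not fit."""
    match = _REQ.match(line)
    if match is None:
        return None
    spec = match.group("spec")
    # Anything after the name must start like a specifier or a direct reference.
    if spec and spec[0] not in "<>=!~@":
        return None
    pinned = _PINNED.match(spec)
    return match.group("name"), (pinned.group("version") if pinned else None)


for sample in ["PyYAML", "Cerberus==1.3.4", "graphviz>=0.20"]:
    print(sample, "->", name_and_pin(sample))
```

A name-only line parses without trouble; what it lacks is a version, which is the gap an SBOM generator then has to fill from somewhere else.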

My goal is to cover all use cases for the SBOM generation and hence I’d need to be able to support all variations of dependencies and formats for describing these. I did test out the above change in requirements.py:

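        # require and DistributionNotFound come from pkg_resources.
        for requirement in parsed_rf.requirements:
            # Local paths have no package name, so fall back to the link URL.
            name = requirement.link.url if requirement.is_local_path else requirement.name
            # Use the pinned version if there is one; otherwise ask the handler.
            version = requirement.get_pinned_version or self.no_version_handler(requirement=requirement, pkg_name=name)
...
    def no_version_handler(self, pkg_name: str, requirement: InstallRequirement) -> str:
        try:
            # Only resolves packages that are installed in the current environment.
            pkg = require(pkg_name)[0]
        except DistributionNotFound:
            print('Running -r with no pinned versions require dependencies to be installed. Run: `pip install -r requirements.txt` before running cyclonedx-bom.')
            exit(1)
        try:
            lines = pkg.get_metadata_lines('METADATA')
        except FileNotFoundError:
            # Older distributions ship PKG-INFO instead of METADATA.
            lines = pkg.get_metadata_lines('PKG-INFO')
        for line in lines:
            # Note: this returns the License field, not a version, which is
            # why the BOM below lists license names under <version>.
            if line.startswith('License:'):
                return line[9:]
        # Fall back to the unpinned specifier string.
        return str(requirement.dumps_specifier())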

This yields the following BOM:

<?xml version="1.0" encoding="UTF-8"?>
<bom xmlns="http://cyclonedx.org/schema/bom/1.4" version="1" serialNumber="urn:uuid:0375246a-04da-4488-8fb3-bb0d77c49247">
    <metadata>
...
    </metadata>
    <components>
        <component type="library" bom-ref="f9233fb0-7c5f-4b53-a13f-1a3a95802229">
            <name>Cerberus</name>
            <version>ISC</version>
            <purl>pkg:pypi/cerberus@ISC</purl>
        </component>
        <component type="library" bom-ref="439355ab-955c-4850-9093-e33d50ba5521">
            <name>Jinja2</name>
            <version>BSD-3-Clause</version>
            <purl>pkg:pypi/jinja2@BSD-3-Clause</purl>
        </component>
        <component type="library" bom-ref="0dc1d956-35ef-4c46-9184-ce53de30e1ae">
            <name>PyYAML</name>
            <version>MIT</version>
            <purl>pkg:pypi/pyyaml@MIT</purl>
        </component>
        <component type="library" bom-ref="7ee47e91-cd0a-413b-b3f3-3fd1ae1f270c">
            <name>graphviz</name>
            <version>MIT</version>
            <purl>pkg:pypi/graphviz@MIT</purl>
        </component>
    </components>
    <dependencies>
        <dependency ref="f9233fb0-7c5f-4b53-a13f-1a3a95802229" />
        <dependency ref="439355ab-955c-4850-9093-e33d50ba5521" />
        <dependency ref="0dc1d956-35ef-4c46-9184-ce53de30e1ae" />
        <dependency ref="7ee47e91-cd0a-413b-b3f3-3fd1ae1f270c" />
    </dependencies>
</bom>

Would this still be considered a breaking change?
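Note that the BOM above carries license names (ISC, MIT, BSD-3-Clause) in the version field and in the purl, because the handler returns the `License:` metadata line. A minimal sketch of looking up the installed version instead, using the standard library's `importlib.metadata`, could look as follows. The helper names are hypothetical, and the purl normalisation (lowercase, underscores replaced by dashes) is an assumption based on the package-url rules for the pypi type; this is not the project's actual implementation.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(pkg_name: str) -> Optional[str]:
    """Return the installed version of pkg_name, or None if it is not installed."""
    try:
        return version(pkg_name)
    except PackageNotFoundError:
        return None


def resolve_version(pkg_name: str, pinned: Optional[str]) -> Optional[str]:
    """Prefer the pinned version; otherwise fall back to the installed one."""
    return pinned or installed_version(pkg_name)


def pypi_purl(name: str, ver: Optional[str]) -> str:
    """Build a pypi purl; the version part is omitted when unknown."""
    # Assumed normalisation for the pypi purl type.
    normalized = name.lower().replace("_", "-")
    return f"pkg:pypi/{normalized}@{ver}" if ver else f"pkg:pypi/{normalized}"


print(pypi_purl("PyYAML", resolve_version("PyYAML", "6.0")))
print(pypi_purl("Jinja2", None))
```

Returning `None` for an unknown version, instead of exiting or substituting another field, leaves the caller free to decide whether to emit a component without a version or to report an error.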


