pip doesn't support relative paths in direct URL references


Environment

  • pip version: 19.1.1
  • Python version: 3.7
  • OS: MacOS Mojave

Description

I’m not sure if this is a six bug or a Pip bug. Excuse me if it belongs to six.

Pip seems to allow local paths in install_requires via name @ ./some/path, but the URL parsing is terribly broken.

https://github.com/pypa/pip/blob/a38a0eacd4650a7d9af4e6831a2c0aed0b6a0329/src/pip/_internal/download.py#L670-L690

In this function, it uses urlsplit to get the individual components of the incoming URL.

Here’s what that looks like with a few pieces of input:

./foo/bar -> SplitResult(scheme='', netloc='', path='./foo/bar', query='', fragment='')
file:./foo/bar -> SplitResult(scheme='file', netloc='', path='./foo/bar', query='', fragment='')
file://./foo/bar -> SplitResult(scheme='file', netloc='.', path='/foo/bar', query='', fragment='')

Notice the last one results in the netloc being . instead of empty and the path being absolute, not local. This trips the error regarding non-local paths. That’s all fine and well - I can use the second form to satisfy the conditional logic (though it really ought to support the first as well).
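The three results above can be reproduced with the standard library alone. The following is a minimal sketch, assuming Python 3's urllib.parse in place of six; `is_local_file_url` is a hypothetical helper that only approximates the "local path" condition described here and is not pip's actual code.

```python
from urllib.parse import urlsplit


def is_local_file_url(url):
    """Hypothetical helper: True for a file: URL whose netloc is empty or 'localhost'."""
    parts = urlsplit(url)
    return parts.scheme == "file" and parts.netloc in ("", "localhost")


for candidate in ("./foo/bar", "file:./foo/bar", "file://./foo/bar"):
    parts = urlsplit(candidate)
    # Show the pieces that matter for the check: scheme, netloc and path.
    print(candidate, "->", repr(parts.scheme), repr(parts.netloc), repr(parts.path),
          "local:", is_local_file_url(candidate))
```

Only the file:./foo/bar form passes this approximation: the bare path has no scheme, and file://./foo/bar ends up with a netloc of '.'.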

However, there’s conflicting logic elsewhere…

https://github.com/pypa/pip/blob/169eebdb6e36a31e545804228cad94902a1ec8e9/src/pip/_vendor/packaging/requirements.py#L103-L106

This is the logic that fails even though we satisfy the prior logic.

Here’s a test function that shows the problem:

from six.moves.urllib import parse as urllib_parse

def tryparse(url):
    print(url)
    parsed = urllib_parse.urlparse(url)
    unparsed = urllib_parse.urlunparse(parsed)
    parsed_again = urllib_parse.urlparse(unparsed)
    print(parsed)
    print(unparsed)
    print(parsed_again)

Here’s the output for ./foo/bar:

>>> tryparse('./foo/bar')
./foo/bar
ParseResult(scheme='', netloc='', path='./foo/bar', params='', query='', fragment='')
./foo/bar
ParseResult(scheme='', netloc='', path='./foo/bar', params='', query='', fragment='')

All good, though it doesn’t satisfy the first function’s logic of requiring a scheme of file:.

Here’s the output for file:./foo/bar:

>>> tryparse('file:./foo/bar')
file:./foo/bar
ParseResult(scheme='file', netloc='', path='./foo/bar', params='', query='', fragment='')
file:///./foo/bar
ParseResult(scheme='file', netloc='', path='/./foo/bar', params='', query='', fragment='')

Oops! Notice how, when we “unparse” the result from the first parse call, our path becomes absolute file:///....

This is why the second check mentioned above fails: the path is no longer local. I believe this is a bug in six, but it can be mitigated in Pip by allowing scheme in ['file', ''] and instructing users to use the ./foo/bar URI form.
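A compact way to test whether a given URL survives the parse/unparse cycle is sketched below. It assumes Python 3's urllib.parse rather than six, and `roundtrip` is a name introduced here for illustration. The result for file:./foo/bar is deliberately not asserted, since it may differ between Python versions.

```python
from urllib.parse import urlparse, urlunparse


def roundtrip(url):
    """Return (rebuilt_url, is_stable) for one parse/unparse cycle."""
    rebuilt = urlunparse(urlparse(url))
    return rebuilt, rebuilt == url


for candidate in ("./foo/bar", "file:./foo/bar", "file:///tmp/foo/bar"):
    rebuilt, stable = roundtrip(candidate)
    print(f"{candidate!r} -> {rebuilt!r} (stable: {stable})")
```

A URL whose rebuilt form differs from the input is the kind that trips the second check.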

Given these two contradictory pieces of logic, it’s impossible to use local paths in install_requires keys in either distutils or setuptools configurations.

Expected behavior

I should be able to do name @ ./some/path (or, honestly, simply ./some/path) to specify a vendored package local to my codebase.

How to Reproduce

#!/usr/bin/env bash
mkdir /tmp/pip-uri-repro && cd /tmp/pip-uri-repro

mkdir -p foo/bar

cat > requirements.txt <<EOF
./foo
EOF

cat > foo/setup.py <<EOF
#!/usr/bin/env python
from setuptools import setup
setup(
    name="foo",
    version="0.1",
    install_requires=[
        "bar @ file:./bar"
    ]
)
EOF

cat > foo/bar/setup.py <<EOF
#!/usr/bin/env python
from setuptools import setup
setup(
    name="bar",
    version="0.1"
)
EOF

# (OUTPUT 1)
pip install -r requirements.txt

cat > foo/setup.py <<EOF
#!/usr/bin/env python
from setuptools import setup
setup(
    name="foo",
    version="0.1",
    install_requires=[
        # we're forced to use an absolute path
        # to make the "Invalid URL" error go
        # away, which isn't right anyway (the
        # error that is raised as a result
        # is justified)
        "bar @ file://./bar"
    ]
)
EOF

# (OUTPUT 2)
pip install -r requirements.txt

Output

From the first pip install:

Processing ./foo
    ERROR: Complete output from command python setup.py egg_info:
    ERROR: error in foo setup command: 'install_requires' must be a string or list of strings containing valid project/version requirement specifiers; Invalid URL given

From the second pip install:

Processing ./foo
ERROR: Exception:
Traceback (most recent call last):
  File "/private/tmp/repro-pip-egg/env3/lib/python3.7/site-packages/pip/_internal/cli/base_command.py", line 178, in main
    status = self.run(options, args)
  File "/private/tmp/repro-pip-egg/env3/lib/python3.7/site-packages/pip/_internal/commands/install.py", line 352, in run
    resolver.resolve(requirement_set)
  File "/private/tmp/repro-pip-egg/env3/lib/python3.7/site-packages/pip/_internal/resolve.py", line 131, in resolve
    self._resolve_one(requirement_set, req)
  File "/private/tmp/repro-pip-egg/env3/lib/python3.7/site-packages/pip/_internal/resolve.py", line 294, in _resolve_one
    abstract_dist = self._get_abstract_dist_for(req_to_install)
  File "/private/tmp/repro-pip-egg/env3/lib/python3.7/site-packages/pip/_internal/resolve.py", line 242, in _get_abstract_dist_for
    self.require_hashes
  File "/private/tmp/repro-pip-egg/env3/lib/python3.7/site-packages/pip/_internal/operations/prepare.py", line 256, in prepare_linked_requirement
    path = url_to_path(req.link.url)
  File "/private/tmp/repro-pip-egg/env3/lib/python3.7/site-packages/pip/_internal/download.py", line 521, in url_to_path
    % url
ValueError: non-local file URIs are not supported on this platform: 'file://./bar'

EDIT:

Just found out that RFC 3986 specifies that relative path URIs are not permitted with the file: scheme, so technically six should be erroring out on file:./foo/bar.

However, that means, technically, I should be able to do the following in my setup.py:

import os

# Absolute directory containing this setup.py
PKG_DIR = os.path.dirname(os.path.abspath(__file__))
install_requires = [
    f"name @ file://{PKG_DIR}/foo/bar"
]

However, pip seems to be creating a “clean” copy of the package in /tmp, so we get something like file:///tmp/pip-req-build-9u3z545j/foo/bar.

Running that through our test function, we satisfy the second function’s conditional:

>>> tryparse('file:///tmp/pip-req-build-9u3z545j/foo/bar')
file:///tmp/pip-req-build-9u3z545j/foo/bar
ParseResult(scheme='file', netloc='', path='/tmp/pip-req-build-9u3z545j/foo/bar', params='', query='', fragment='')
file:///tmp/pip-req-build-9u3z545j/foo/bar
ParseResult(scheme='file', netloc='', path='/tmp/pip-req-build-9u3z545j/foo/bar', params='', query='', fragment='')

Everything is good there. The “unparse” yields the same result, and the netloc requirements are met for the first function’s conditional.

However, we’re still met with an Invalid URL error, even though the second function’s logic is satisfied.

Since pip (or distutils or setuptools or whatever) swallows output, I went ahead and did the following in my setup.py:

import os
PKG_DIR = os.path.dirname(os.path.abspath(__file__))
assert False, os.system(f"find {PKG_DIR}")

This verifies that all of the files are there, as expected, so the cause can’t be a missing file. The line above containing "Invalid URL given" is the only place in the codebase where that string shows up.

At this point, I’m not sure what the problem is.

Issue Analytics

  • State: open
  • Created 4 years ago
  • Reactions: 21
  • Comments: 8 (3 by maintainers)

Top GitHub Comments

26 reactions
Qix- commented, Jun 28, 2019

Okay, I see the problem. setuptools, pkg-resources and pip all use slightly different versions of the packaging library.

In pip, it’s the version I showed above.

However, in everything else, it’s the following (I’m not sure which is the “newer” one, but the following logic is very limiting and not fully compliant as per RFC 3986 as file:/// should be allowed, implying an empty netloc):

        if req.url:
            parsed_url = urlparse.urlparse(req.url)
            if not (parsed_url.scheme and parsed_url.netloc) or (
                    not parsed_url.scheme and not parsed_url.netloc):
                raise InvalidRequirement("Invalid URL given")

🙄

That means my file path fails because it uses file:///foo/bar (empty netloc) rather than file://localhost/foo/bar.
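To see which URLs that condition rejects, it can be lifted into a standalone predicate. This is a sketch only: it copies the boolean expression from the snippet above, uses Python 3's urllib.parse, and `is_rejected` is a hypothetical name, not part of packaging.

```python
from urllib.parse import urlparse


def is_rejected(url):
    """Mirror the quoted condition: True when "Invalid URL given" would be raised."""
    parsed = urlparse(url)
    return not (parsed.scheme and parsed.netloc) or (
        not parsed.scheme and not parsed.netloc)


for candidate in ("./foo/bar", "file:///foo/bar", "file://localhost/foo/bar"):
    print(candidate, "rejected" if is_rejected(candidate) else "accepted")
```

Any URL missing either a scheme or a netloc is rejected, which is why only the file://localhost form gets through.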

Here is the complete solution:

import os
from setuptools import setup

PKG_DIR = os.path.dirname(os.path.abspath(__file__))

setup(
    install_requires=[
        f'foo @ file://localhost{PKG_DIR}/foo/bar'
    ]
)
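If several vendored packages need this treatment, the string building can be wrapped in a small helper. The sketch below assumes a POSIX-style filesystem (absolute paths start with /), and `direct_reference` is a hypothetical name introduced here, not a setuptools or pip API.

```python
import os
from urllib.parse import quote


def direct_reference(name, path):
    """Build a 'name @ file://localhost/abs/path' requirement string (POSIX paths assumed)."""
    abs_path = os.path.abspath(path)
    # quote() percent-encodes spaces and similar characters but leaves '/' intact.
    return f"{name} @ file://localhost{quote(abs_path)}"


print(direct_reference("bar", "/tmp/pip-uri-repro/foo/bar"))
```

In a setup.py, the path argument would typically be built from PKG_DIR as in the example above.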

This is pretty bad UX mixed in with ambiguous and time-wasting errors.

How can we improve this situation?

4 reactions
hackermd commented, May 15, 2020

What is the status of this issue? A solution for resolving local dependencies that are not on PyPI is urgently needed, for example in the context of monolithic repositories.

Note that npm has implemented this feature in a similar way and dependencies can be specified in package.json using a local path.
