pip doesn't properly parse git URL if branch name contains @ or #

Description

Running pip install git+https://example.com/repository@branch fails if the branch name contains @ or #, even when the character is percent-encoded.

Expected behavior

pip must split the branch name from the URL, decode any percent-encoded special characters in the branch name, clone the repository, and check out the named branch. For example,

pip install git+https://example.com/repository@master%40test

must clone https://example.com/repository and check out the branch master@test. The same applies to the # character, percent-encoded as %23.
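
To make the expected encoding concrete, here is a minimal Python sketch of how a branch name would be percent-encoded before being appended to the URL, and decoded again afterwards. The helper name `vcs_url_for_branch` is hypothetical and exists only for illustration; `quote` and `unquote` are the standard `urllib.parse` functions.

```python
from urllib.parse import quote, unquote


def vcs_url_for_branch(repo_url: str, branch: str) -> str:
    """Hypothetical helper: build a pip-style VCS URL for a branch.

    quote() leaves "/" alone by default and escapes "@", "#" and "!".
    """
    return "git+" + repo_url + "@" + quote(branch)


url = vcs_url_for_branch("https://example.com/repository", "master@test")
print(url)  # git+https://example.com/repository@master%40test

# "#" is escaped as %23, and unquote() restores the original branch name.
print(quote("master#test"))      # master%23test
print(unquote("master%23test"))  # master#test
```

A round trip through `quote` and `unquote` returns the original branch name, which is what the expected behavior above relies on.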

pip version

Any; tested with 21.1.3

Python version

Any; tested with Python 3.9

OS

Any; tested with Debian 10 buster

How to Reproduce

Here is a test program, test-pip-git, that creates a repository, tries pip download, and cleans up:

#! /bin/sh
set -e

PERCENT_ENCODING=0
while getopts p: opt; do
    case $opt in
        p ) PERCENT_ENCODING="${OPTARG:-1}" ;;
    esac
done
shift `expr $OPTIND - 1`

if [ -z "$1" ]; then
    echo "Usage: $0 [-p1|2] test_char" >&2
    exit 1
fi

TEST_CHAR1="$1"
if [ $PERCENT_ENCODING -ge 1 ]; then

    py_ver=`python -c "import sys; print(sys.version_info[0])"`
    if [ $py_ver -eq 2 ]; then
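        percent_encode() {
            # Pass the value via argv so that quotes in it cannot break the Python code
            python -c "import sys, urllib; print(urllib.quote(sys.argv[1]))" "$1"
        }
    elif [ $py_ver -eq 3 ]; then
        percent_encode() {
            python -c "import sys, urllib.parse; print(urllib.parse.quote(sys.argv[1]))" "$1"
        }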
    else
        echo "Unknown python version" >&2
        exit 1
    fi
    TEST_CHAR2=`percent_encode "$1"`
    if [ $PERCENT_ENCODING -eq 2 ]; then
        TEST_CHAR2=`percent_encode "$TEST_CHAR2"`
    fi
else
    TEST_CHAR2="$1"
fi

rm -rf test-pip-git-repo test-pip-git-spec-char-0.0.1.zip
git init test-pip-git-repo
cd test-pip-git-repo

echo test >test
git add test
git commit -m test

git branch -M master # to fixed name
git checkout -b test${TEST_CHAR1}test # new branch

cat >setup.py <<EOF
#!/usr/bin/env python

from setuptools import setup

setup(
    name='test_pip_git_spec_char',
    version='0.0.1',
    description='Test pip+git+special characters',
    author='Oleg Broytman',
    author_email='phd@phdru.name',
    keywords=['pip', 'git', '@', '!', '#', '/'],
    platforms='Any',
)
EOF

git add setup.py
git commit -m setup.py
git checkout master # make test branch non-current

cd ..
pip download git+file://`pwd`/test-pip-git-repo@test${TEST_CHAR2}test | grep '\(clone\|checkout\)' || : # ignore errors

rm -rf test-pip-git-repo test-pip-git-spec-char-0.0.1.zip

Output

./test-pip-git @

Running command git clone -q file:///home/phd/tmp/test-pip-git-repo@master /tmp/pip-req-build-v1v16zoe
fatal: '/home/phd/tmp/test-pip-git-repo@master' does not appear to be a git repository

pip clones the wrong repository, test-pip-git-repo@master; the repository should be test-pip-git-repo.

./test-pip-git -p1 @

Running command git clone -q file:///home/phd/tmp/test-pip-git-repo@master /tmp/pip-req-build-__0b6wh5
fatal: '/home/phd/tmp/test-pip-git-repo@master' does not appear to be a git repository

The same wrong repository, even with the @ percent-encoded.
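
One way to reproduce this symptom outside pip is sketched below. It assumes, purely as a guess that matches the log above and not as a statement about pip's actual implementation, that the path is percent-decoded first and then split on the last @. The function `naive_split` is hypothetical.

```python
from typing import Optional, Tuple
from urllib.parse import unquote


def naive_split(path: str) -> Tuple[str, Optional[str]]:
    """Hypothetical model of the failure: decode first, then split on the last '@'."""
    decoded = unquote(path)
    if "@" not in decoded:
        return decoded, None
    repo, _, rev = decoded.rpartition("@")
    return repo, rev


plain = naive_split("/home/phd/tmp/test-pip-git-repo@master@test")
encoded = naive_split("/home/phd/tmp/test-pip-git-repo@master%40test")

# Both forms end up with the repository path ".../test-pip-git-repo@master".
print(plain)
print(encoded)
```

Under that assumption the encoded and unencoded forms are indistinguishable by the time the split happens, which would explain why -p1 changes nothing.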

./test-pip-git \!

Running command git clone -q file:///home/phd/tmp/test-pip-git-repo /tmp/pip-req-build-fsnq5vt_
Running command git checkout -b 'master!test' --track 'origin/master!test'

Just a test with another, less special character. pip clones the correct repository, test-pip-git-repo, and checks out the correct branch, master!test. It doesn’t even require %-encoding. Test passed!

./test-pip-git \#

Running command git clone -q file:///home/phd/tmp/test-pip-git-repo /tmp/pip-req-build-i0wy10_1
ERROR: File "setup.py" not found

pip clones correct repository test-pip-git-repo but doesn’t check out branch master#test. It just uses branch master and ignores everything after #.

./test-pip-git -p1 \#

Exactly the same problem.
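
The behavior with an unencoded # is consistent with standard URL parsing, where # starts the fragment. The short sketch below uses only `urllib.parse.urlsplit` on the URL from the log; it does not claim to show pip's code path. It also shows that an encoded %23 stays in the path, so the -p1 failure suggests the URL is decoded somewhere before it is split (an assumption, not verified here).

```python
from urllib.parse import urlsplit

# Unencoded "#": everything after it becomes the fragment.
raw = urlsplit("git+file:///home/phd/tmp/test-pip-git-repo@master#test")
print(raw.path)      # /home/phd/tmp/test-pip-git-repo@master
print(raw.fragment)  # test

# Encoded "%23": the branch name stays in the path and the fragment is empty.
enc = urlsplit("git+file:///home/phd/tmp/test-pip-git-repo@master%23test")
print(enc.path)      # /home/phd/tmp/test-pip-git-repo@master%23test
print(enc.fragment)  # (empty string)
```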

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 15 (8 by maintainers)

Top GitHub Comments

uranusjr commented, Oct 7, 2021 (2 reactions)

The VCS URLs should be first parsed by urlsplit, and then we apply our custom parsing logic to the path part. The #egg= part belongs to the fragment, not the path.

>>> from urllib.parse import urlsplit
>>> urlsplit('https://example.com/%40uranusjr/pkg@dev#egg=myproj')
SplitResult(scheme='https', netloc='example.com', path='/%40uranusjr/pkg@dev', query='', fragment='egg=myproj')
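
The order of operations described in this comment can be sketched as follows. This is a minimal illustration, not pip's implementation: `split_vcs_url` is a hypothetical helper, and the choice to decode only the revision after splitting is an assumption about how the proposal would be applied.

```python
from typing import Optional, Tuple
from urllib.parse import unquote, urlsplit, urlunsplit


def split_vcs_url(url: str) -> Tuple[str, Optional[str], str]:
    """Hypothetical helper: return (repository URL, revision, fragment)."""
    # Step 1: generic URL parsing, so "#egg=..." lands in the fragment.
    scheme, netloc, path, query, fragment = urlsplit(url)
    # Drop the VCS prefix ("git+https" -> "https").
    if "+" in scheme:
        scheme = scheme.split("+", 1)[1]
    # Step 2: custom logic on the still-encoded path only.
    rev = None
    if "@" in path:
        path, rev = path.rsplit("@", 1)
        # Decode after splitting, so an encoded "@" or "#" survives.
        rev = unquote(rev)
    repo = urlunsplit((scheme, netloc, path, query, ""))
    return repo, rev, fragment


print(split_vcs_url("git+https://example.com/%40uranusjr/pkg@dev#egg=myproj"))
print(split_vcs_url("git+https://example.com/repository@master%40test"))
print(split_vcs_url("git+https://example.com/repository@master%23test"))
```

Because the path is split while it is still percent-encoded, the %40 in /%40uranusjr/ and in master%40test is never mistaken for the revision separator.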

uranusjr commented, Oct 4, 2021 (1 reaction)

Not \. Special characters in the URL need to be percent-encoded.
