pip doesn't properly parse git URL if branch name contains @ or #
Description
Trying pip install git+https://example.com/repository@branch fails if the branch name contains the characters @ or #, even when they are percent-encoded.
Expected behavior
pip must parse percent-encoded special characters in the branch name, split the branch name from the URL, clone the repository, and check out the named branch with the special characters decoded. For example, pip install git+https://example.com/repository@master%40test must clone https://example.com/repository and check out the master@test branch. The same applies to the # character, percent-encoded as %23.
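As a small illustration of the encoding the expected behavior relies on, the following sketch uses the standard library's urllib.parse to encode and decode the two branch names mentioned above. The variable names are made up for this example.

```python
from urllib.parse import quote, unquote

# Branch names from the report (illustrative list).
branches = ("master@test", "master#test")

# safe="" makes quote() encode every reserved character, including "/".
encoded = {branch: quote(branch, safe="") for branch in branches}

for branch, value in encoded.items():
    # Decoding must give back the original branch name.
    assert unquote(value) == branch
    print(branch, "->", value)
# master@test -> master%40test
# master#test -> master%23test
```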
pip version
Any; tested with 21.1.3
Python version
Any; tested with Python 3.9
OS
Any; tested with Debian 10 buster
How to Reproduce
Here is a test program, test-pip-git, that creates a repository, tries pip download, and cleans up:
#! /bin/sh
set -e

# -p N: percent-encode the test character N times (1 or 2); default is 0.
PERCENT_ENCODING=0
while getopts p: opt; do
    case $opt in
        p ) PERCENT_ENCODING="${OPTARG:-1}" ;;
    esac
done
shift `expr $OPTIND - 1`

if [ -z "$1" ]; then
    echo "Usage: $0 [-p1|2] test_char" >&2
    exit 1
fi

# TEST_CHAR1 is used for the real branch name, TEST_CHAR2 for the URL.
TEST_CHAR1="$1"
if [ $PERCENT_ENCODING -ge 1 ]; then
    py_ver=`python -c "import sys; print(sys.version_info[0])"`
    if [ $py_ver -eq 2 ]; then
        percent_encode() {
            python -c "import urllib; print(urllib.quote('$1'))"
        }
    elif [ $py_ver -eq 3 ]; then
        percent_encode() {
            python -c "import urllib.parse; print(urllib.parse.quote('$1'))"
        }
    else
        echo "Unknown python version" >&2
        exit 1
    fi
    TEST_CHAR2=`percent_encode "$1"`
    if [ $PERCENT_ENCODING -eq 2 ]; then
        # Double encoding: encode the already encoded value once more.
        TEST_CHAR2=`percent_encode "$TEST_CHAR2"`
    fi
else
    TEST_CHAR2="$1"
fi

# Create a fresh repository with one commit on master.
rm -rf test-pip-git-repo test-pip-git-spec-char-0.0.1.zip
git init test-pip-git-repo
cd test-pip-git-repo
echo test >test
git add test
git commit -m test
git branch -M master # to fixed name

# setup.py exists only on the branch with the special character.
git checkout -b test${TEST_CHAR1}test # new branch
cat >setup.py <<EOF
#!/usr/bin/env python
from setuptools import setup
setup(
    name='test_pip_git_spec_char',
    version='0.0.1',
    description='Test pip+git+special characters',
    author='Oleg Broytman',
    author_email='phd@phdru.name',
    keywords=['pip', 'git', '@', '!', '#', '/'],
    platforms='Any',
)
EOF
git add setup.py
git commit -m setup.py
git checkout master # make test branch non-current
cd ..

# Show only the git commands pip runs, then clean up.
pip download git+file://`pwd`/test-pip-git-repo@test${TEST_CHAR2}test | grep '\(clone\|checkout\)' || : # ignore errors
rm -rf test-pip-git-repo test-pip-git-spec-char-0.0.1.zip
Output
./test-pip-git @
Running command git clone -q file:///home/phd/tmp/test-pip-git-repo@master /tmp/pip-req-build-v1v16zoe
fatal: '/home/phd/tmp/test-pip-git-repo@master' does not appear to be a git repository
pip clones the incorrect repository test-pip-git-repo@master; the repo must be test-pip-git-repo.
./test-pip-git -p1 @
Running command git clone -q file:///home/phd/tmp/test-pip-git-repo@master /tmp/pip-req-build-__0b6wh5
fatal: '/home/phd/tmp/test-pip-git-repo@master' does not appear to be a git repository
The same incorrect repo.
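One plausible reading of this output is that the repository path is separated from the revision at the last @ in the URL. The sketch below only reproduces that effect on a plain string; it is an assumption about the cause, not code taken from pip. If the URL is percent-decoded before such a split (also an assumption), the %40 form would end up in the same place.

```python
# Literal "@" inside the branch name, as in the first test run.
url_path = "/home/phd/tmp/test-pip-git-repo@master@test"

# Naive approach: treat everything after the last "@" as the revision.
repo_path, rev = url_path.rsplit("@", 1)

print(repo_path)  # /home/phd/tmp/test-pip-git-repo@master
print(rev)        # test
```

The repository path printed here matches the one git clone was given in the output above.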
./test-pip-git \!
Running command git clone -q file:///home/phd/tmp/test-pip-git-repo /tmp/pip-req-build-fsnq5vt_
Running command git checkout -b 'master!test' --track 'origin/master!test'
Just a test with another, less special character. pip clones the correct repository test-pip-git-repo and checks out the correct branch master!test. It doesn't even require %-encoding. Test passed!
./test-pip-git \#
Running command git clone -q file:///home/phd/tmp/test-pip-git-repo /tmp/pip-req-build-i0wy10_1
ERROR: File "setup.py" not found
pip clones the correct repository test-pip-git-repo but doesn't check out branch master#test. It just uses branch master and ignores everything after #.
./test-pip-git -p1 \#
Exactly the same problem.
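The # result is consistent with how generic URL parsing treats that character: a literal # starts the fragment, so it never reaches the path. A short sketch with the standard library's urlsplit shows the difference between the literal and the percent-encoded form. It says nothing about where pip itself decodes the URL.

```python
from urllib.parse import urlsplit

# Literal "#": everything after it becomes the fragment.
raw = urlsplit("git+file:///home/phd/tmp/test-pip-git-repo@master#test")
print(raw.path)      # /home/phd/tmp/test-pip-git-repo@master
print(raw.fragment)  # test

# Percent-encoded "#": the branch name stays inside the path.
enc = urlsplit("git+file:///home/phd/tmp/test-pip-git-repo@master%23test")
print(enc.path)      # /home/phd/tmp/test-pip-git-repo@master%23test
print(enc.fragment)  # (empty string)
```

The encoded form can only survive if nothing decodes %23 back to # before the URL is split.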
Top GitHub Comments
The VCS URLs should first be parsed by urlsplit, and then we apply our custom parsing logic to the path part. The #egg= part belongs to the fragment, not the path.

Not \. Special characters in the URL need to be percent-encoded.
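The order described in these comments (urlsplit first, then custom logic on the path only, with decoding last) could look roughly like the sketch below. split_vcs_url is a hypothetical helper written for this illustration and is not pip's implementation; the egg name pkg is also made up.

```python
from urllib.parse import unquote, urlsplit, urlunsplit


def split_vcs_url(url):
    """Hypothetical helper: return (repo_url, rev, fragment)."""
    scheme, netloc, path, query, fragment = urlsplit(url)
    # Drop a "git+" style VCS prefix from the scheme.
    scheme = scheme.split("+", 1)[-1]
    rev = None
    if "@" in path:
        # Split while the path is still percent-encoded, then decode the rev.
        path, encoded_rev = path.rsplit("@", 1)
        rev = unquote(encoded_rev)
    # Rebuild the repository URL without the revision and the fragment.
    repo_url = urlunsplit((scheme, netloc, path, query, ""))
    return repo_url, rev, fragment


print(split_vcs_url("git+https://example.com/repository@master%40test#egg=pkg"))
# ('https://example.com/repository', 'master@test', 'egg=pkg')
```

Because the split happens on the still-encoded path, an encoded @ or # in the branch name cannot be mistaken for the revision separator or the start of the fragment.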