Recursive search for egg-info directories breaks nested projects


Issue description

During locking, when the resolver subprocess looks for a name for a requirement, and the requirement is a local path, then it (via requirementslib) will recursively search for an .egg-info directory and get the name from there. But the .egg-info directory might be unrelated to the package to install. In particular, a pathological case occurs when:

  • A Pipfile lists multiple packages, one nested inside the other
  • The inner one is processed first
  • Processing the inner one generates an .egg-info directory (because setup.py egg_info is invoked)

In this case, the resolver process ends up with the same name for the inner and outer package, and things get badly confused. See “Steps to replicate” below.
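The failure mode can be illustrated with a minimal sketch. This is not requirementslib's actual code: `find_egg_info_recursive` and `demo` are hypothetical names, and the directory layout is only a stand-in for the nested projects described above. It shows that a recursive search started at the outer project returns the inner project's .egg-info directory when the outer one has none of its own.

```python
import os
import tempfile


def find_egg_info_recursive(root):
    """Return the first *.egg-info directory found anywhere under root."""
    for dirpath, dirnames, _filenames in os.walk(root):
        dirnames.sort()  # deterministic traversal order for the demo
        for name in dirnames:
            if name.endswith(".egg-info"):
                return os.path.join(dirpath, name)
    return None


def demo():
    with tempfile.TemporaryDirectory() as outer:
        # The outer project has no metadata yet; the inner project was
        # processed first and left an .egg-info directory behind.
        os.makedirs(os.path.join(outer, "scrambled_eggs"))
        os.makedirs(os.path.join(outer, "french_toast", "french_toast.egg-info"))
        found = find_egg_info_recursive(outer)
        return os.path.relpath(found, outer)
```

Calling `demo()` returns the path of the inner project's metadata directory, even though the search was started for the outer project.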

Expected result

Pipfile.lock has the correct path for each project

Actual result

Pipfile.lock is corrupted and confuses the outer and inner packages

    "default": {
        "french-toast": {
            "editable": true,
            "path": "."
        },
        "scrambled-eggs": {
            "editable": "true",
            "path": "."
        }
    },

Steps to replicate

  1. Clone the repo https://github.com/owtaylor/scrambled_eggs. It has the following structure:
scrambled_eggs/
   Pipfile
   setup.py
   scrambled_eggs/
      __init__.py
   french_toast/
       setup.py
       french_toast/
            __init__.py

and the following Pipfile:

[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true

[packages]
french_toast = {path = "./french_toast", editable = "true"}
scrambled_eggs = {path = ".", editable = "true"}

  2. Make sure that there are no existing .egg-info directories from previous runs (git clean -dxf)
  3. Run pipenv lock and observe the Pipfile.lock generated

NOTE: the order of requirement processing is arbitrary - it may be necessary to force it by editing https://github.com/pypa/pipenv/blob/main/pipenv/utils/resolver.py#L152

WORKAROUND: Run python setup.py egg_info at the top level before running pipenv lock. This will create an .egg-info directory for the outer package, which will be found first.

I’m not really sure what the right fix here is - requirementslib could stop scanning recursively for .egg-info directories and only look at direct subdirectories of the given path - but there’s probably some reason that the recursive search was added?
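As a rough sketch of what the non-recursive alternative could look like (an assumption for illustration, not a patch against requirementslib; `find_egg_info_direct` is a hypothetical name), the search below only considers .egg-info directories that are direct children of the given path:

```python
import os
import tempfile


def find_egg_info_direct(root):
    """Return an *.egg-info directory that is a direct child of root, if any."""
    with os.scandir(root) as entries:
        matches = sorted(
            entry.path
            for entry in entries
            if entry.is_dir() and entry.name.endswith(".egg-info")
        )
    return matches[0] if matches else None
```

With this approach an .egg-info directory belonging to a nested project is never picked up for the outer one. The trade-off is that a project whose metadata is generated in a subdirectory would no longer be found this way.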


Pipenv version: '2022.5.2'

Pipenv location: '/var/home/otaylor/.local/share/virtualenvs/scrambled_eggs-mbSpAmTK/lib/python3.10/site-packages/pipenv'

Python location: '/var/home/otaylor/.local/share/virtualenvs/scrambled_eggs-mbSpAmTK/bin/python'

Python installations found:

  • 3.10.4: /var/home/otaylor/.local/share/virtualenvs/scrambled_eggs-mbSpAmTK/bin/python
  • 3.10.4: /var/home/otaylor/.local/share/virtualenvs/scrambled_eggs-mbSpAmTK/bin/python3
  • 3.10.4: /usr/bin/python3
  • 3.10.4: /bin/python3
  • 3.9.12: /usr/bin/python3.9
  • 3.9.12: /usr/bin/pypy3.9
  • 3.9.12: /usr/bin/pypy3
  • 3.9.12: /bin/python3.9
  • 3.9.12: /bin/pypy3.9
  • 3.9.12: /bin/pypy3
  • 3.8.13: /usr/bin/python3.8
  • 3.8.13: /usr/bin/pypy3.8
  • 3.8.13: /bin/python3.8
  • 3.8.13: /bin/pypy3.8
  • 3.7.13: /usr/bin/python3.7
  • 3.7.13: /usr/bin/python3.7m
  • 3.7.13: /bin/python3.7
  • 3.7.13: /bin/python3.7m
  • 3.6.15: /usr/bin/python3.6
  • 3.6.15: /usr/bin/python3.6m
  • 3.6.15: /bin/python3.6
  • 3.6.15: /bin/python3.6m
  • 2.7.18: /usr/bin/python2
  • 2.7.18: /usr/bin/python2.7
  • 2.7.18: /usr/bin/pypy2
  • 2.7.18: /usr/bin/pypy
  • 2.7.18: /usr/bin/pypy2.7
  • 2.7.18: /bin/python2
  • 2.7.18: /bin/python2.7
  • 2.7.18: /bin/pypy2
  • 2.7.18: /bin/pypy
  • 2.7.18: /bin/pypy2.7

PEP 508 Information:

{'implementation_name': 'cpython',
 'implementation_version': '3.10.4',
 'os_name': 'posix',
 'platform_machine': 'x86_64',
 'platform_python_implementation': 'CPython',
 'platform_release': '5.15.8-200.fc35.x86_64',
 'platform_system': 'Linux',
 'platform_version': '#1 SMP Tue Dec 14 14:26:01 UTC 2021',
 'python_full_version': '3.10.4',
 'python_version': '3.10',
 'sys_platform': 'linux'}

System environment variables:

  • SHELL
  • COLORTERM
  • HISTCONTROL
  • XDG_MENU_PREFIX
  • PIPENV_ACTIVE
  • HOSTNAME
  • HISTSIZE
  • SSH_AUTH_SOCK
  • DISTTAG
  • DESKTOP_SESSION
  • EDITOR
  • NAME
  • PWD
  • XDG_SESSION_DESKTOP
  • LOGNAME
  • XDG_SESSION_TYPE
  • TOOLBOX_PATH
  • GNOME_TERMINAL_FEATURES
  • XAUTHORITY
  • container
  • TOOLBOX_CONTAINER
  • PIP_PYTHON_PATH
  • HOME
  • LANG
  • LS_COLORS
  • XDG_CURRENT_DESKTOP
  • FGC
  • VIRTUAL_ENV
  • VTE_VERSION
  • WAYLAND_DISPLAY
  • GNOME_TERMINAL_SCREEN
  • ANSIBLE_NOCOWS
  • TERM
  • LESSOPEN
  • USER
  • PIP_DISABLE_PIP_VERSION_CHECK
  • GNOME_TERMINAL_SERVICE
  • DISPLAY
  • SHLVL
  • PYTHONDONTWRITEBYTECODE
  • XDG_RUNTIME_DIR
  • PS1
  • DEBUGINFOD_URLS
  • which_declare
  • XDG_DATA_DIRS
  • PATH
  • VERSION
  • DBUS_SESSION_BUS_ADDRESS
  • MAIL
  • OLDPWD
  • BASH_FUNC_which%%
  • _
  • PIP_SHIMS_BASE_MODULE
  • PYTHONFINDER_IGNORE_UNSUPPORTED

Pipenv–specific environment variables:

  • PIPENV_ACTIVE: 1

Debug–specific environment variables:

  • PATH: /var/home/otaylor/.local/share/virtualenvs/scrambled_eggs-mbSpAmTK/bin:/var/home/otaylor/.local/bin:/var/home/otaylor/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
  • SHELL: /bin/bash
  • EDITOR: /usr/bin/vim
  • LANG: C.UTF-8
  • PWD: /var/home/otaylor/Source/scrambled_eggs
  • VIRTUAL_ENV: /var/home/otaylor/.local/share/virtualenvs/scrambled_eggs-mbSpAmTK

Issue Analytics

  • State: closed
  • Created a year ago
  • Reactions: 1
  • Comments: 8

Top GitHub Comments

2 reactions
owtaylor commented, Jun 6, 2022

The heart of the first bug is requirementslib.Requirement._parse_name_from_path:

    def _parse_name_from_path(self):
        # type: () -> Optional[S]
        # OT: is_installable_dir basically checks whether pyproject.toml or setup.py exists
        if self.path and self.is_local and is_installable_dir(self.path):
            metadata = get_metadata(self.path)
            if metadata:
                name = metadata.get("name", "")
                if name and name != "wheel":
                    return name
            parsed_setup_cfg = self.parsed_setup_cfg
            if parsed_setup_cfg:
                name = parsed_setup_cfg.get("name", "")
                if name:
                    return name

            parsed_setup_py = self.parsed_setup_py
            if parsed_setup_py:
                name = parsed_setup_py.get("name", "")
                if name and isinstance(name, str):
                    return name
        return None

  • Why is the get_metadata() path there at all? Is it an optimization? Is it to handle cases where the buildsystem is something unknown and doesn’t have a setup.cfg or setup.py? But why would we think there would necessarily be an existing .egg-info in that case? I think that to handle the no-setup.py, no-setup.cfg case properly, requirementslib would have to use prepare_metadata_for_build_wheel.
  • If we change this by a) moving get_metadata() to the end, b) not calling get_metadata() if self.editable, or c) never calling get_metadata() here - is there going to be the same problem with the Requirement.metadata property? (It is not obviously used by pipenv.)
  • Why does get_metadata() search recursively? Is this for the case of having package_dir = {'': 'lib'} in setup.py (which would relocate the egg-info to the lib subdir)?
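To make option a) concrete, here is a minimal sketch under stated assumptions: it only reads a static name from setup.cfg with the standard-library configparser, and `name_from_setup_cfg`, `resolve_name` and the `egg_info_lookup` callback are hypothetical names, not requirementslib APIs. The point is only the ordering: the project's own declaration is consulted first, and any egg-info based lookup is the last resort.

```python
import configparser
import os
import tempfile


def name_from_setup_cfg(project_dir):
    """Return the [metadata] name from setup.cfg, or None if it is absent."""
    parser = configparser.ConfigParser()
    # read() returns the list of files it managed to parse (empty if missing)
    if not parser.read(os.path.join(project_dir, "setup.cfg")):
        return None
    return parser.get("metadata", "name", fallback=None)


def resolve_name(project_dir, egg_info_lookup):
    """Prefer the static declaration; fall back to the egg-info lookup last."""
    return name_from_setup_cfg(project_dir) or egg_info_lookup(project_dir)
```

With this ordering, a stale or unrelated .egg-info directory can only influence the result when the project declares no name of its own in setup.cfg.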

The other route would be to try to get the name passed in when creating the Requirement() object - the basic problem here is that when resolving dependencies we go from 'scrambled_eggs = {path = ".", editable = "true"}' to '-e .' and then if the resolution process figures out a name for . other than scrambled_eggs, everything gets confused. Certainly, at least checking that the names are the same would improve resilience. (Relying on it entirely would raise questions about what happens if there’s a typo in a package name in the Pipfile.)
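A sanity check along those lines might look like the sketch below. It is an illustration only: `canonicalize` and `check_resolved_name` are hypothetical helpers, not pipenv or requirementslib functions. Names are compared after PEP 503 style normalization so that scrambled_eggs and scrambled-eggs count as the same project.

```python
import re


def canonicalize(name):
    """Normalize a project name: runs of '-', '_', '.' become '-', lowercased."""
    return re.sub(r"[-_.]+", "-", name).lower()


def check_resolved_name(pipfile_key, resolved_name):
    """Raise if the resolver's name disagrees with the Pipfile entry."""
    if canonicalize(pipfile_key) != canonicalize(resolved_name):
        raise ValueError(
            f"Pipfile entry {pipfile_key!r} resolved to {resolved_name!r}"
        )
    return resolved_name
```

In the scenario from this issue, the outer package would fail loudly (Pipfile key scrambled_eggs, resolved name french-toast) instead of silently producing a corrupted Pipfile.lock.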

1 reaction
owtaylor commented, Jun 6, 2022

Hi - while I’ve been using Python for 20+ years, and want to help the ecosystem, I don’t have the capacity to become involved in buildsystem hacking. Many thanks to you and others who work on this! My interest here, basically, is that I fell into a hole and spent a day trying to get out of it, and don’t want to leave it there for others to fall into 😉

My current understanding of this issue is that it really is two issues that are combining in a hard-to-understand way:

  1. A long-standing issue where, when requirementslib is trying to find the name of a {path="x", editable="true"} requirement it recursively searches for .egg-info directories and uses the information in preference to finding the name out from setup.py / setup.cfg.
  2. A regression in the last year where pipenv has started leaving behind .egg-info directories when locking {path="x", editable="true"} dependencies. (Which seems to be because the pep517 code paths in requirementslib were broken a year ago, and when they were fixed started doing this.)

For 2., I’ll file a separate bug, but don’t want to work on fixing it - it seems quite hard with the PEP517 abstraction involved - though it might be easy for someone who understands it. The first issue is more obviously tractable, and I can at least try to propose a partial solution, if you are willing to discuss possible changes with me.
