Performance is dramatically worse for `-r requirements` than without it.

Bug description

It is very slow.

Reproduction steps

Compare `pip-audit` with `pip-audit -r requirements.txt`: the `-r` version is unbearably slow (I’ve never seen it finish). I’ve reproduced the hang in Alpine and off VPN, so my .pypirc and VPN are not the cause.

Clone this: https://github.com/matthewdeanmartin/pip_audit_in_docker/

FROM python:3.9-alpine
RUN pip install pip-audit
RUN mkdir -p app
# Thrashing, not sure why it fails
RUN chown root:root /app
RUN chown root:root /tmp
# pip-audit will install cffi on first run, which needs a C toolchain and the libffi headers.
RUN apk add --no-cache libffi-dev build-base
WORKDIR /app
ENTRYPOINT ["pip-audit"]

Build and run like this:

docker build -t pip-audit .
docker run -v "$PWD":/app -e PIP_AUDIT_LOGLEVEL=debug pip-audit -r requirements_for_safety.txt

With -r it is very slow (it never finishes). Without -r it finishes, but that isn’t a scenario I care about: if I have to install malicious code before I can audit it, what good is that, eh?
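To put a number on the gap instead of waiting on a hang, the two invocations can be timed under a hard timeout. The following is a minimal Python sketch and not part of pip-audit; `time_command` is a hypothetical helper name introduced here for illustration.

```python
import subprocess
import time
from typing import Optional, Sequence, Tuple


def time_command(cmd: Sequence[str], timeout_s: float) -> Tuple[Optional[int], float]:
    """Run cmd and return (exit_code, elapsed_seconds).

    exit_code is None when the command was killed for exceeding timeout_s.
    Hypothetical helper for illustration; not part of pip-audit.
    """
    start = time.monotonic()
    try:
        completed = subprocess.run(
            list(cmd),
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        code: Optional[int] = completed.returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left running.
        code = None
    return code, time.monotonic() - start
```

Assuming pip-audit is on the PATH, `time_command(["pip-audit"], 600)` and `time_command(["pip-audit", "-r", "requirements.txt"], 600)` would give the two timings; the 600-second limit is an arbitrary choice.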

Expected behavior

It is very fast.

Issue Analytics

Created 2 years ago
Comments:11 (10 by maintainers)

Top GitHub Comments

woodruffw commented, Dec 7, 2021

Nope, you understood correctly! That’s a problem with what I’m proposing, and I don’t have a good answer to it yet.

Braindump follows:

The big problem with our current approach is that we’re essentially running resolvelib N + 1 times: once at the virtualenv level each time we attempt an sdist resolution step (via pip install), and then one final time at the “top” as we collect the entire dependency graph.

Composing them in this way is unsound, since resolvelib is tasked with making global (concretizing) dependency decisions at each individual step while the “top” resolver has its own selection requirements that can’t be simultaneously honored. The end result is #197.

I think there are two ways we can fix this:

1. We can carry more information in our Resolver implementation, enabling it to backtrack further. Using #197 as an example, the trick might be keeping two sets of resolution information: the concrete versions at each step, and the (narrowed?) constraints at each step. Then, instead of our “top” resolvelib seeing two separate concrete versions of e.g. setuptools, it could instead check the compatibility of the constraints and select the maximal version that satisfies both. I’m not 100% sure what that looks like yet, partially because I don’t fully understand the different knobs that resolvelib offers 😅
2. We can give up on using resolvelib entirely, and go with a “naive” approach where we just create a new venv and forward the requirement files into it, collecting the fully resolved dependencies at the very end. We could make this roughly as responsive as the current approach by using the communicate() APIs for subprocesses, updating our AuditState each time we receive a line from the underlying pip install process.
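The “naive” option can be sketched roughly as follows. This is an illustration under assumptions, not pip-audit’s implementation: `stream_lines` and `resolve_in_fresh_venv` are hypothetical names, the `on_line` callback stands in for whatever would update AuditState, and the sketch iterates over the child’s stdout line by line rather than calling `communicate()`, since `communicate()` only returns once the process has exited.

```python
import subprocess
import sys
import venv
from pathlib import Path
from typing import Callable, List, Sequence


def stream_lines(cmd: Sequence[str], on_line: Callable[[str], None]) -> int:
    """Run cmd, hand each output line to on_line as it arrives, return the exit code."""
    with subprocess.Popen(
        list(cmd),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so progress and errors arrive in order
        text=True,
        bufsize=1,  # line-buffered on our side of the pipe
    ) as proc:
        assert proc.stdout is not None
        for line in proc.stdout:
            on_line(line.rstrip("\n"))
    # Leaving the with-block waits for the child, so returncode is populated here.
    return proc.returncode


def resolve_in_fresh_venv(
    requirements: Path, venv_dir: Path, on_line: Callable[[str], None]
) -> List[str]:
    """Install requirements into a new venv and return its `pip freeze` lines.

    Hypothetical helper: pip inside the venv does all of the resolution.
    """
    venv.EnvBuilder(with_pip=True).create(venv_dir)
    bin_dir = venv_dir / ("Scripts" if sys.platform == "win32" else "bin")
    python = str(bin_dir / "python")

    code = stream_lines(
        [python, "-m", "pip", "install", "-r", str(requirements)], on_line
    )
    if code != 0:
        raise RuntimeError(f"pip install exited with status {code}")

    frozen = subprocess.run(
        [python, "-m", "pip", "freeze"], check=True, capture_output=True, text=True
    )
    return frozen.stdout.splitlines()
```

Note that this installs the requirements into the throwaway venv, so it does not address the concern raised in the reproduction steps about installing code before auditing it.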

woodruffw commented, Dec 7, 2021

Another incremental improvement here: #194 will make pip-audit -r about 10% faster on contrived benchmarks, and probably even faster on real workloads.
