exclude provided dependencies

What’s the problem this feature will solve?

In some environments requirements are already satisfied by the system runtime. Being able to exclude provided packages would remove the need for these to be installed.

eg:

  • Spark applications use pyspark, but at runtime pyspark is already provided. pyspark is a 200MB+ package, so installing it again is inefficient
  • The AWS Lambda Python runtime provides boto3 and other packages
  • App Engine provides pyyaml

Describe the solution you’d like

An ability to exclude dependencies, eg:

pip install -r requirements.txt --exclude pyspark
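The `--exclude` flag on `pip install` shown above is the option this issue proposes, not something pip provides. As a rough illustration of the idea, the sketch below filters a requirements file before it is handed to pip. The function name `filter_requirements` is hypothetical, and the approach only drops requirements that are named directly in the file: a package that itself depends on pyspark would still pull it in.

```python
import re

# Matches the project name at the start of a requirement line.
_NAME_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def canonical_name(name):
    """Normalize a project name so 'PySpark' and 'pyspark' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def filter_requirements(lines, provided):
    """Return the requirement lines whose project is not in `provided`.

    Hypothetical helper: comments, blank lines and option lines
    (those starting with '-') are passed through unchanged.
    """
    excluded = {canonical_name(name) for name in provided}
    kept = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith(("#", "-")):
            kept.append(line)
            continue
        match = _NAME_RE.match(stripped)
        if match and canonical_name(match.group(1)) in excluded:
            continue  # provided by the runtime, so skip it
        kept.append(line)
    return kept
```

The filtered lines could then be written to a temporary file and passed to `pip install -r` as usual.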

Alternative Solutions

Whilst it’s possible to selectively uninstall the dependency after installation, this still requires downloading and building a wheel (eg: in the case of pyspark) which is inefficient.

Additional context

This feature is similar to the Maven "provided" dependency scope used by Java applications.

This issue has previously been raised and closed in https://github.com/pypa/pip/issues/3090

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

3 reactions
pfmoore commented, Sep 10, 2021

So the spark executor adds pyspark to sys.path (somehow) and that’s only visible at runtime, meaning that pip can’t see it? Which means that the environment pip is installing isn’t usable from a “normal” Python interpreter, but only via the spark executor?

That’s not exactly the sort of environment pip expects to be managing (I’m trying hard to avoid saying “unsupported” here, but there’s a line somewhere and this feels close to it to me).

The metadata pip expects is to have a .dist-info directory in one of the locations on sys.path at install time.
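One way to see whether that metadata is visible from a given interpreter is to query it with the standard library's `importlib.metadata` (Python 3.8+). This is only a sketch: pip uses its own machinery, and the helper name `installed_version` is hypothetical. It does illustrate the point of the comment, though: a package that is importable only inside the executor would have no metadata to find here.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of `dist_name`, or None.

    None means no distribution metadata (such as a .dist-info
    directory) was found on the current sys.path.
    """
    try:
        return metadata.distribution(dist_name).version
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Run outside the spark executor, this prints None unless pyspark
    # has been installed into this environment.
    print(installed_version("pyspark"))
```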

Two suggestions:

  1. It sounds like packages that depend on pyspark can only be run within the spark executor. If so, I’d treat that as more of a platform than a dependency, and say that such projects should not declare a dependency on pyspark at all - you’d treat “must be run by a spark executor of such-and-such a version” as an external dependency (much like “must have graphviz installed” for pygraphviz).
  2. Have a “dummy” pyspark package on PyPI that simply raises an exception at import time saying “You need to run this in the spark executor”. Then the executor inserts its “real” version of pyspark earlier in sys.path, so when run in the executor the code works, but when run outside of the executor, it fails with a useful error at import time.
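The second suggestion relies on sys.path ordering: the first directory containing a matching module wins. The sketch below demonstrates that mechanism with two temporary directories and a made-up module name, `provided_runtime_demo`, standing in for pyspark. The directory names and helper functions are assumptions for illustration, not how Spark or any real stub package is laid out.

```python
import importlib
import sys
import tempfile
from pathlib import Path

MODULE = "provided_runtime_demo"  # hypothetical stand-in for pyspark


def write_module(directory, source):
    """Create MODULE as a single-file module inside `directory`."""
    (directory / f"{MODULE}.py").write_text(source)


def try_import(search_path):
    """Import MODULE with `search_path` temporarily prepended to sys.path."""
    saved = list(sys.path)
    sys.modules.pop(MODULE, None)
    importlib.invalidate_caches()
    sys.path[:0] = [str(p) for p in search_path]
    try:
        return importlib.import_module(MODULE)
    finally:
        sys.path[:] = saved
        sys.modules.pop(MODULE, None)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        stub_dir = Path(tmp) / "site-packages"  # where the dummy would be installed
        runtime_dir = Path(tmp) / "runtime"     # what the executor would add
        stub_dir.mkdir()
        runtime_dir.mkdir()
        write_module(
            stub_dir,
            'raise ImportError("You need to run this in the spark executor")\n',
        )
        write_module(runtime_dir, 'VERSION = "real"\n')

        # Outside the executor only the dummy is visible, so import fails.
        try:
            try_import([stub_dir])
            outside = "imported"
        except ImportError as exc:
            outside = str(exc)

        # Inside the executor the real module comes first and shadows the dummy.
        inside = try_import([runtime_dir, stub_dir]).VERSION
        return outside, inside


if __name__ == "__main__":
    print(demo())
```

With this arrangement the project can still declare the dependency, and a missing runtime shows up as a clear import-time error rather than a confusing failure later on.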

Hopefully those options are useful for you. I remain -1 on any sort of “ignore dependencies” option for pip as it feels like it’s something that will get abused more often than it’ll be used correctly…

0 reactions
pfmoore commented, Sep 12, 2021

Awesome, I’m glad this helped you find a solution that works for you 🙂
