
How to specify CUDA version in a conda package?

See original GitHub issue

How should a package maintainer specify a dependency on a specific CUDA version like 9.2 or 10.0?

As an example, here is how PyTorch does things today:

  • CUDA 8.0: conda install pytorch torchvision cuda80 -c pytorch
  • CUDA 9.2: conda install pytorch torchvision -c pytorch
  • CUDA 10.0: conda install pytorch torchvision cuda100 -c pytorch
  • No CUDA: conda install pytorch-cpu torchvision-cpu -c pytorch

I believe that NVIDIA and Anaconda handle things differently. I have zero thoughts on which way is correct, but I thought it would be useful to start a conversation around this. My hope is that we can come to some consensus on packaging conventions that helps users avoid broken environments and provides a good pattern for future package maintainers to follow.

cc @jjhelmus @msarahan @nehaljwani @stuartarchibald @seibert @sklam @soumith @kkraus14 @mike-wendt @datametrician

Issue Analytics

  • State: open
  • Created: 5 years ago
  • Reactions: 9
  • Comments: 44 (25 by maintainers)

Top GitHub Comments

21 reactions
seibert commented, Dec 20, 2018

CUDA drivers (the part that conda cannot install) are backward compatible with applications compiled with older versions of CUDA. So, for example, the CUDA 9.2 build of PyTorch would only require that CUDA >= 9.2 is present on the system. This backward compatibility also extends to the cudatoolkit (the userspace libraries supplied by NVIDIA which Anaconda already packages), where a conda environment with cudatoolkit 8.0 would work just fine with a system that has the CUDA 9.2 drivers.

So, on one hand, there is motivation (much like with glibc) to pick an arbitrarily old CUDA version, build everything with it, and rely on driver backward compatibility. On the other hand, aside from new CUDA language features (which projects may choose to ignore for compatibility reasons), building with newer CUDA versions can improve performance and add native support for newer hardware. A package compiled for CUDA 8 will not run on Volta GPUs without a lengthy JIT recompilation of all the CUDA functions in the project; this happens automatically, but it can still be a bad user experience. As an example, TensorFlow compiled with CUDA 8 can take 10+ minutes to start up on a Volta GPU.

These two conflicting desires for compatibility and performance explain why it makes sense to compile packages against a range of CUDA versions (right now, I'd say 8.0 to 10.0 or 9.0 to 10.0 would be the best choice), but that still leaves the burden on the user of knowing which CUDA version they need.

Because nearly all CUDA projects require the CUDA toolkit libraries, and Anaconda packages them, we use the cudatoolkit package as our CUDA version marker. So, for packages in Anaconda that require CUDA, we make them depend on a specific cudatoolkit version. This lets you force a specific CUDA version like so:

conda install pytorch cudatoolkit=8.0

And that will get you a PyTorch compiled with CUDA 8, rather than something else.

The CUDA driver provides a C API to query the maximum CUDA version the driver supports, so a few months ago I wrote a self-contained Python function for detecting which version of CUDA (if any) is present on the system:

https://gist.github.com/seibert/52a204395cdc84eeeaf0ce05464a636b
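
For context, a minimal sketch of that idea looks roughly like the following. This is a simplified stand-in rather than the contents of the gist itself (the function name and library search order here are illustrative):

    import ctypes

    def cuda_driver_version():
        """Return (major, minor) of the highest CUDA version the installed
        driver supports, or None if no usable driver is found."""
        # The driver library name differs by platform: libcuda on Linux,
        # nvcuda.dll on Windows.
        for name in ("libcuda.so.1", "libcuda.so", "nvcuda.dll"):
            try:
                libcuda = ctypes.CDLL(name)
                break
            except OSError:
                continue
        else:
            return None

        # cuInit must be called once before any other driver API function.
        if libcuda.cuInit(0) != 0:
            return None

        # cuDriverGetVersion encodes the version as 1000*major + 10*minor.
        version = ctypes.c_int(0)
        if libcuda.cuDriverGetVersion(ctypes.byref(version)) != 0:
            return None
        return version.value // 1000, (version.value % 1000) // 10

    print(cuda_driver_version())  # e.g. (9, 2) on a system with CUDA 9.2 drivers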

This was for the conda team to potentially incorporate into conda as a “marker” (I think that is the right term), so that conda could expose a cuda package in the dependency solver whose version is given by this function. That would then give everyone a standard way to refer to the system CUDA dependency.

I don’t know where this work is on the roadmap for conda (@msarahan?), but if there is additional work needed on the conda side to get this to the finish line, I’m happy to help. It would go a long way toward unifying the various approaches as well as improving the user experience.

2 reactions
jjhelmus commented, Feb 3, 2019

The addition of the micro version was intentional. NVIDIA labels CUDA releases with a micro version and, I think, has in the past released multiple micro versions for a given major.minor version. With the previous cudatoolkit packages there was no way to differentiate these releases. Including the micro version, as in cudatoolkit 10.0.130, is more specific and allows for updates if a new micro version is released. Package builders and users should still specify the version by major.minor, e.g. conda install cudatoolkit=10.0; conda will automatically provide the appropriate micro version.
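
As a quick way to confirm which micro version the solver actually selected, a small script like the following can be used (illustrative, not part of the original discussion; it assumes conda is available on PATH and relies on conda list --json returning a list of package records):

    import json
    import subprocess

    # Ask conda which cudatoolkit build is installed in the active environment.
    # A spec such as "cudatoolkit=10.0" at install time still resolves to a
    # concrete build like 10.0.130, which is what shows up here.
    result = subprocess.run(
        ["conda", "list", "--json", "^cudatoolkit$"],
        capture_output=True, text=True, check=True,
    )
    packages = json.loads(result.stdout)
    if packages:
        print(packages[0]["name"], packages[0]["version"])
    else:
        print("cudatoolkit is not installed in this environment")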

Read more comments on GitHub

Top Results From Across the Web

Managing CUDA dependencies with Conda | by David R. Pugh
Yep! You can use the conda search command to see what versions of the NVIDIA CUDA Toolkit are available from the default channels....

Working with GPU packages - Anaconda Documentation
GPU-enabled packages are built against a specific version of CUDA. Currently supported versions include CUDA 8, 9.0 and 9.2. The NVIDIA drivers...

Anaconda reading wrong CUDA version - Stack Overflow
Any solution to this? EDIT: When running this code sample: # setting device on GPU if available, else CPU device = torch.device( ...

Install conda and set up a Pytorch 1.7, CUDA 11.1 ...
The first line creates our environment called “PyTorch” and you can select the python version (I choose version 3.7). · The second line...

Compiling CUDA code while using conda environments
Conda is a powerful package manager that is commonly used to create ... This requirement only specified the major version, so to see...
