apex not supporting CUDA 11.0? [Help me]

My nvcc version is CUDA 11.0, but the latest PyTorch build I found on this website is for CUDA 10.2. As a result, I can’t install apex properly.

ImportError: cannot import name 'amp'

Software Versions pre-installed:

Nvidia Driver: 450.51v
CUDA: 11v
cuDNN: 8.0v
Python: 3.8
Docker: 19.03.12v
Nvidia-docker: 2.0v
NGC(Nvidia GPU Cloud) CLI: 1.15.0v

I followed these commands:

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

A plain import of apex works:

python -c "import apex"

But in the main program, it does not work:

Traceback (most recent call last):
  File "train.py", line 188, in <module>
    train(num_gpus, args.rank, args.group_name, **train_config)
  File "train.py", line 83, in train
    from apex import amp
ImportError: cannot import name 'amp'

The import from the apex module fails there.

Please help me to solve this issue @definitelynotmcarilli @thorjohnsen @mcarilli @kexinyu @ptrblck 😃
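
One thing worth ruling out for the traceback above is that train.py resolves a different `apex` than the one pip installed, for example a cloned `apex/` directory sitting next to the script. That is only a guess and is not confirmed anywhere in this issue. A minimal sketch that reports where a package resolves from and whether a submodule imports (the helper name `describe_import` is made up for illustration):

```python
import importlib
import importlib.util


def describe_import(package, submodule):
    """Report where `package` resolves from and whether `package.submodule` imports.

    Hypothetical diagnostic helper; it is not part of apex or PyTorch.
    """
    spec = importlib.util.find_spec(package)
    if spec is None:
        return {"found": False, "origin": None, "locations": [],
                "looks_like_namespace": False, "submodule_ok": False, "error": None}

    locations = list(spec.submodule_search_locations or [])
    # A directory without __init__.py resolves as a namespace package (no origin file).
    looks_like_namespace = spec.origin in (None, "namespace") and bool(locations)

    try:
        importlib.import_module(f"{package}.{submodule}")
        submodule_ok, error = True, None
    except ImportError as exc:
        submodule_ok, error = False, str(exc)

    return {"found": True, "origin": spec.origin, "locations": locations,
            "looks_like_namespace": looks_like_namespace,
            "submodule_ok": submodule_ok, "error": error}


if __name__ == "__main__":
    # Run this from the same working directory as train.py.
    print(describe_import("apex", "amp"))
```

If `locations` points at the cloned repository rather than site-packages, running the script from another directory (or renaming the clone) would be a cheap experiment before rebuilding anything.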

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 24 (8 by maintainers)

Top GitHub Comments

26 reactions
stas00 commented, Nov 13, 2020

Alternatively, since only the minor version differs, you could also try to disable the minor version check (you should get an error with a link to more information in your first run) and rebuild it.

There is no option to do that, so I had to hack setup.py to disable the check:

diff --git a/setup.py b/setup.py
index 063b42d..9eabb49 100644
--- a/setup.py
+++ b/setup.py
@@ -91,6 +91,7 @@ def get_cuda_bare_metal_version(cuda_dir):
     return raw_output, bare_metal_major, bare_metal_minor

 def check_cuda_torch_binary_vs_bare_metal(cuda_dir):
+    return

$ pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
[...]
Successfully installed apex-0.1

So I successfully built apex against the system-wide cuda-11.1 while having pytorch w/ cuda-11.0 installed.

Yay!

And it works just fine!

Thank you, @ptrblck!
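
For anyone who would rather relax the check than remove it outright, here is a rough sketch of a comparison that still rejects a major-version mismatch but tolerates a minor one. This is not apex's actual code: the function names and the `allow_minor_mismatch` flag are hypothetical, and the version strings are passed in by hand so the snippet runs without PyTorch (in a real build the first one would typically come from `torch.version.cuda`).

```python
def parse_version(text):
    """Turn a string like '11.0' or '11.1.105' into (major, minor) integers."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def check_cuda_versions(torch_cuda, toolkit_cuda, allow_minor_mismatch=False):
    """Raise RuntimeError when the two CUDA versions are considered incompatible.

    Hypothetical helper for illustration; apex's setup.py does not expose this flag.
    """
    torch_ver = parse_version(torch_cuda)
    toolkit_ver = parse_version(toolkit_cuda)

    if torch_ver == toolkit_ver:
        return
    # Same major version, different minor: accept only when explicitly allowed.
    if allow_minor_mismatch and torch_ver[0] == toolkit_ver[0]:
        return
    raise RuntimeError(
        f"CUDA toolkit {toolkit_cuda} does not match the CUDA version "
        f"{torch_cuda} that PyTorch was built with."
    )


if __name__ == "__main__":
    # The combination from this comment: PyTorch built with 11.0, toolkit 11.1.
    check_cuda_versions("11.0", "11.1", allow_minor_mismatch=True)
    print("minor mismatch accepted")
```

Whether a minor mismatch is actually safe depends on the CUDA releases involved; it happened to work for 11.0 vs 11.1 here.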

8 reactions
stas00 commented, Nov 11, 2020

How can I tell apex to use cuda-11.0? I have both cuda-11.0 and cuda-11.1 installed and it fails to build as it doesn’t find cuda-11.0:

    Compiling cuda extensions with
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2020 NVIDIA Corporation
    Built on Mon_Oct_12_20:09:46_PDT_2020
    Cuda compilation tools, release 11.1, V11.1.105
    Build cuda_11.1.TC455_06.29190527_0
    from /usr/local/cuda-11.1/bin

    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-req-build-yz2qpdod/setup.py", line 152, in <module>
        check_cuda_torch_binary_vs_bare_metal(torch.utils.cpp_extension.CUDA_HOME)
      File "/tmp/pip-req-build-yz2qpdod/setup.py", line 102, in check_cuda_torch_binary_vs_bare_metal
        raise RuntimeError("Cuda extensions are being compiled with a version of Cuda that does " +
    RuntimeError: Cuda extensions are being compiled with a version of Cuda that does not match the version used to compile Pytorch binaries.  Pytorch binaries were compiled with Cuda 11.0.

Also, would it be possible to provide apex builds on conda-forge for cuda11.0 and cuda11.1?

Thank you!
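
With several toolkits installed, it can help to see which release each one reports before building. The traceback shows the build using `torch.utils.cpp_extension.CUDA_HOME`; as far as I know PyTorch derives that from the `CUDA_HOME` environment variable when it is set, so exporting it to the 11.0 install before running pip may be enough, but treat that as an assumption to verify. A small sketch that parses the `release X.Y` line from `nvcc --version` (the helper names are made up, and the `/usr/local/cuda-11.0` path is a guess based on the 11.1 path in the log above):

```python
import os
import re
import subprocess


def parse_nvcc_release(nvcc_output):
    """Return (major, minor) from `nvcc --version` text, or None if not found."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def nvcc_release_for(cuda_home):
    """Run <cuda_home>/bin/nvcc --version and parse the reported release."""
    nvcc = os.path.join(cuda_home, "bin", "nvcc")
    try:
        result = subprocess.run([nvcc, "--version"], capture_output=True,
                                text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        return None  # nvcc missing or failed to run at this location
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    # Assumed install locations; adjust to wherever the toolkits actually live.
    for home in ("/usr/local/cuda-11.0", "/usr/local/cuda-11.1"):
        print(home, nvcc_release_for(home))
    print("CUDA_HOME =", os.environ.get("CUDA_HOME"))
```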
