"error: cuda_runtime_api.h: No such file or directory"

Hello, I’m trying to run the basic example. For reference, I have several LLMs working and have used Huggingface Hub to download them. However, I get the error in the title. Indeed, this file is not found in /home/user/.local/lib/python3.10/site-packages/torch/include/c10/. I did find it here: /usr/local/cuda-11.7/targets/x86_64-linux/include/cuda_runtime_api.h

I had a challenging time getting my NVIDIA driver to work with the right CUDA version during the torch install. The current PyTorch version is 1.12.1+cu116. You can see version 11.7 in the path above. I’m not sure how relevant that is, but this is the only combination of CUDA and torch versions I could get working. I think c10 denotes the default version of torch installed with Python 3.10 on Ubuntu 22.04, which is supported by this quote from SE:

“PyTorch doesn’t use the system’s CUDA library. When you install PyTorch using the precompiled binaries using either pip or conda it is shipped with a copy of the specified version of the CUDA library which is installed locally.”

The output does say: “Installed CUDA version 11.7 does not match the version torch was compiled with 11.6 but since the APIs are compatible, accepting this combination”, followed by “Using /home/user/.cache/torch_extensions/py310_cu116 as PyTorch extensions root...”
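
To make the mismatch in that message concrete, here is a minimal sketch that reads the CUDA version out of a PyTorch version tag such as 1.12.1+cu116 and compares it with the system toolkit version. The helper names are hypothetical, and the rule "same major version is acceptable" is only an assumption for illustration; it is not taken from the code that actually prints the message.

```python
import re


def cuda_from_torch_version(version):
    """Return (major, minor) from a tag like '1.12.1+cu116', or None if absent."""
    match = re.search(r"\+cu(\d+)$", version)
    if match is None:
        return None
    digits = match.group(1)
    # The last digit is the minor version: 'cu116' -> (11, 6), 'cu102' -> (10, 2).
    return int(digits[:-1]), int(digits[-1])


def same_major(torch_cuda, system_cuda):
    """Assumed compatibility rule: only the major versions have to match."""
    return torch_cuda[0] == system_cuda[0]


torch_cuda = cuda_from_torch_version("1.12.1+cu116")  # version string from this post
system_cuda = (11, 7)  # taken from the /usr/local/cuda-11.7 path in this post
print(torch_cuda, system_cuda, same_major(torch_cuda, system_cuda))
```

With the versions quoted in this post, the two differ only in the minor number, which matches the "accepting this combination" wording of the message.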

Do I need to set some environment vars and/or install another version of PyTorch in a virtualenv? I’m a little short on space, so I’m hoping not. It seems there is some conflict between the default PyTorch c10 locations and the discovered 11.6/11.7 version of CUDA.
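
One way to investigate the environment-variable question is to check which CUDA toolkit directory actually contains cuda_runtime_api.h. The sketch below uses only the standard library; the function name and the candidate list are assumptions made for illustration. CUDA_HOME is the variable PyTorch's extension builder consults when locating the toolkit, but whether setting it resolves this particular error is not confirmed anywhere in this thread.

```python
import os
from pathlib import Path

HEADER = "cuda_runtime_api.h"
# Sub-directories of a CUDA toolkit root where the header may live.
INCLUDE_SUBDIRS = ("include", "targets/x86_64-linux/include")


def find_cuda_root(candidates):
    """Return the first candidate root that contains the header, else None."""
    for root in candidates:
        for sub in INCLUDE_SUBDIRS:
            if (Path(root) / sub / HEADER).is_file():
                return str(root)
    return None


# Candidate roots: the current CUDA_HOME (if set) plus paths assumed from this post.
candidates = [os.environ.get("CUDA_HOME", ""), "/usr/local/cuda", "/usr/local/cuda-11.7"]
root = find_cuda_root([c for c in candidates if c])
if root is None:
    print(f"{HEADER} not found under any candidate root")
else:
    print(f"{HEADER} found; a possible setting is: export CUDA_HOME={root}")
```

If a root is reported, exporting CUDA_HOME to that path before the extension build runs is something to try, not a guaranteed fix.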

Quick side note: the models were downloaded to /tmp/mii_models. Is it possible to use the standard Huggingface model locations?

Issue Analytics

Created a year ago
Comments:15 (7 by maintainers)

Top GitHub Comments

auwsom commented, Oct 27, 2022

@mrwyattii Also, sorry, I didn’t realize that collapsing the error above broke the formatting. It doesn’t do that on GH Gist. So here it is again in case it helps, but it is just the same error. I’ll try to clone this VM and run another torch install on the HDD after expanding it.

https://gist.github.com/auwsom/2faf04fc8280685a3342e87a32402113

mrwyattii commented, Nov 21, 2022

Closing this due to inactivity. Please reopen if there are updates.