CML workflow with GPU fails with LD_LIBRARY_PATH error
Using a CML GitHub workflow with the docker://dvcorg/cml:0-dvc2-base1-gpu
container fails to utilise the GPU due to an LD_LIBRARY_PATH error:
2021-07-12 10:52:46.210323: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2021-07-12 10:52:47.495637: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-12 10:52:47.496273: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:00:1e.0 name: Tesla T4 computeCapability: 7.5
coreClock: 1.59GHz coreCount: 40 deviceMemorySize: 14.75GiB deviceMemoryBandwidth: 298.08GiB/s
2021-07-12 10:52:47.496789: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64
2021-07-12 10:52:47.496915: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcublas.so.10'; dlerror: libcublas.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64
2021-07-12 10:52:47.498149: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2021-07-12 10:52:47.498508: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2021-07-12 10:52:47.501456: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2021-07-12 10:52:47.501637: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusparse.so.10'; dlerror: libcusparse.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64
2021-07-12 10:52:47.501773: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64
2021-07-12 10:52:47.501792: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1598] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
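To make the log above easier to act on, here is a minimal Python sketch that pulls out the libraries TensorFlow reported as unloadable and the LD_LIBRARY_PATH it reported, then checks whether each library file exists in a given list of directories. The helper names (`missing_libraries`, `reported_search_dirs`, `locate`) are hypothetical and not part of CML or TensorFlow. It also assumes the log format shown above. Note that the dynamic loader searches more than LD_LIBRARY_PATH (for example the ldconfig cache), so a miss here does not prove that a library is absent from the container.

```python
import os
import re

# Matches TensorFlow dso_loader warnings of the form shown in the log above:
#   Could not load dynamic library 'libcudart.so.10.1'; dlerror: ...
_MISSING_LIB = re.compile(r"Could not load dynamic library '([^']+)'")
# Matches the search path that the same warning echoes back.
_SEARCH_PATH = re.compile(r"LD_LIBRARY_PATH: (\S+)")


def missing_libraries(log_text):
    """Return the library names reported as unloadable, in order, without duplicates."""
    seen = []
    for name in _MISSING_LIB.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


def reported_search_dirs(log_text):
    """Return the directories from the first LD_LIBRARY_PATH echoed in the log."""
    match = _SEARCH_PATH.search(log_text)
    return match.group(1).split(":") if match else []


def locate(name, dirs):
    """Return the first directory in dirs that contains a file called name, or None."""
    for directory in dirs:
        if os.path.exists(os.path.join(directory, name)):
            return directory
    return None
```

Run inside the container, a loop such as `for lib in missing_libraries(log): print(lib, locate(lib, reported_search_dirs(log)))` would show which of the reported libraries are not present in the directories listed in LD_LIBRARY_PATH.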
Issue Analytics
- State:
- Created 2 years ago
- Reactions:1
- Comments:17 (6 by maintainers)
Top GitHub Comments
Thank you for your measured response @0x2b3bfa0, that was quite a confusing and unhelpful message!
Ah, my mistake. What I actually did was the following, which is the reverse of what is mentioned in that comment (I missed this): https://stackoverflow.com/a/67642774