Can't find ptxas binary in ${CUDA_DIR}/bin.
Description
I tried to run the Reformer model with the configuration reformer_enwik8.gin and got an error: Can't find ptxas binary in ${CUDA_DIR}/bin. …
Environment information
OS: Ubuntu 18.04.3 LTS
$ pip freeze | grep tensor
mesh-tensorflow==0.1.7
tensor2tensor==1.15.4
tensorboard==1.15.0
tensorflow-datasets==1.3.2
tensorflow-estimator==1.15.1
tensorflow-gan==2.0.0
tensorflow-gpu==1.15.0
tensorflow-hub==0.7.0
tensorflow-metadata==0.15.2
tensorflow-probability==0.7.0
tensorrt==6.0.1.4
$ pip freeze | grep jax
jax==0.1.57
jaxlib==0.1.37
$ python -V
python 3.6.8
$ nvcc --version
CUDA 10.0 (/usr/local/cuda --> /usr/local/cuda-10.0; /usr/local/cuda-10.1 also exists)
GPU: 4 x 2080 Ti
For bugs: reproduction and error logs
# Steps to reproduce:
Run trainer.py in trax/trax using the configuration reformer_enwik8.gin.
# Error logs:
[Routine log output about the dataset has been removed.]
I0119 09:32:55.178084 140128464549696 problem.py:651] Reading data files from /root/tensorflow_datasets/t2t_enwik8_l65k/enwik8_l65k-dev*
INFO:tensorflow:partition: 0 num_data_files: 1
I0119 09:32:55.179685 140128464549696 problem.py:677] partition: 0 num_data_files: 1
I0119 09:32:56.124050 140128464549696 inputs.py:443] Heuristically setting bucketing to False based on shapes of target tensors.
I0119 09:32:56.131589 140128464549696 inputs.py:443] Heuristically setting bucketing to False based on shapes of target tensors.
I0119 09:32:56.136316 140128464549696 inputs.py:443] Heuristically setting bucketing to False based on shapes of target tensors.
I0119 09:33:05.191175 140128464549696 trainer_lib.py:754] Model loaded from ../checkpoints/model.pkl at step 0
Model loaded from ../checkpoints/model.pkl at step 0
I0119 09:33:05.192780 140128464549696 trainer_lib.py:754] Step 0: Starting training using 1 devices
Step 0: Starting training using 1 devices
I0119 09:33:05.194077 140128464549696 trainer_lib.py:754] Step 0: Total number of trainable weights: 215865602
Step 0: Total number of trainable weights: 215865602
2020-01-19 09:33:09.105234: E external/org_tensorflow/tensorflow/core/platform/default/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-01-19 09:33:09.105464: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:73] Can't find ptxas binary in ${CUDA_DIR}/bin. Will back to the GPU driver for PTX -> sass compilation. This is OK so long as you don't see a warning below about an out-of-date driver version.
2020-01-19 09:33:09.105489: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:74] Searched for CUDA in the following directories:
2020-01-19 09:33:09.105517: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] ./cuda_sdk_lib
2020-01-19 09:33:09.105532: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] /usr/local/cuda
2020-01-19 09:33:09.105554: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] .
2020-01-19 09:33:09.105567: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:79] You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions. For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
2020-01-19 09:33:09.193084: E external/org_tensorflow/tensorflow/core/platform/default/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-01-19 09:33:09.193291: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:73] Can't find ptxas binary in ${CUDA_DIR}/bin. Will back to the GPU driver for PTX -> sass compilation. This is OK so long as you don't see a warning below about an out-of-date driver version.
2020-01-19 09:33:09.193319: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:74] Searched for CUDA in the following directories:
2020-01-19 09:33:09.193338: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] ./cuda_sdk_lib
2020-01-19 09:33:09.193354: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] /usr/local/cuda
2020-01-19 09:33:09.193384: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] .
2020-01-19 09:33:09.193418: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:79] You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions. For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
2020-01-19 09:33:09.345517: E external/org_tensorflow/tensorflow/core/platform/default/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-01-19 09:33:09.345708: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:73] Can't find ptxas binary in ${CUDA_DIR}/bin. Will back to the GPU driver for PTX -> sass compilation. This is OK so long as you don't see a warning below about an out-of-date driver version.
2020-01-19 09:33:09.345732: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:74] Searched for CUDA in the following directories:
2020-01-19 09:33:09.345749: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] ./cuda_sdk_lib
2020-01-19 09:33:09.345762: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] /usr/local/cuda
2020-01-19 09:33:09.345776: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] .
2020-01-19 09:33:09.345790: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:79] You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions. For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
2020-01-19 09:33:09.440697: E external/org_tensorflow/tensorflow/core/platform/default/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-01-19 09:33:09.440881: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:73] Can't find ptxas binary in ${CUDA_DIR}/bin. Will back to the GPU driver for PTX -> sass compilation. This is OK so long as you don't see a warning below about an out-of-date driver version.
2020-01-19 09:33:09.440903: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:74] Searched for CUDA in the following directories:
2020-01-19 09:33:09.440918: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] ./cuda_sdk_lib
2020-01-19 09:33:09.440930: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] /usr/local/cuda
2020-01-19 09:33:09.440941: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] .
2020-01-19 09:33:09.440954: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:79] You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions. For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
2020-01-19 09:33:09.545554: E external/org_tensorflow/tensorflow/core/platform/default/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-01-19 09:33:09.545752: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:73] Can't find ptxas binary in ${CUDA_DIR}/bin. Will back to the GPU driver for PTX -> sass compilation. This is OK so long as you don't see a warning below about an out-of-date driver version.
2020-01-19 09:33:09.545774: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:74] Searched for CUDA in the following directories:
2020-01-19 09:33:09.545791: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] ./cuda_sdk_lib
2020-01-19 09:33:09.545804: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] /usr/local/cuda
2020-01-19 09:33:09.545815: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] .
2020-01-19 09:33:09.545827: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:79] You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions. For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
2020-01-19 09:33:09.730990: E external/org_tensorflow/tensorflow/core/platform/default/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-01-19 09:33:09.731233: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:73] Can't find ptxas binary in ${CUDA_DIR}/bin. Will back to the GPU driver for PTX -> sass compilation. This is OK so long as you don't see a warning below about an out-of-date driver version.
2020-01-19 09:33:09.731260: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:74] Searched for CUDA in the following directories:
2020-01-19 09:33:09.731279: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] ./cuda_sdk_lib
2020-01-19 09:33:09.731293: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] /usr/local/cuda
2020-01-19 09:33:09.731305: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] .
2020-01-19 09:33:09.731319: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:79] You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions. For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
2020-01-19 09:33:10.081432: E external/org_tensorflow/tensorflow/core/platform/default/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-01-19 09:33:10.081621: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:73] Can't find ptxas binary in ${CUDA_DIR}/bin. Will back to the GPU driver for PTX -> sass compilation. This is OK so long as you don't see a warning below about an out-of-date driver version.
2020-01-19 09:33:10.081644: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:74] Searched for CUDA in the following directories:
2020-01-19 09:33:10.081659: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] ./cuda_sdk_lib
2020-01-19 09:33:10.081671: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] /usr/local/cuda
2020-01-19 09:33:10.081708: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] .
2020-01-19 09:33:10.081721: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:79] You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions. For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
2020-01-19 09:33:13.557328: E external/org_tensorflow/tensorflow/core/platform/default/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-01-19 09:33:13.557530: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:73] Can't find ptxas binary in ${CUDA_DIR}/bin. Will back to the GPU driver for PTX -> sass compilation. This is OK so long as you don't see a warning below about an out-of-date driver version.
2020-01-19 09:33:13.557552: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:74] Searched for CUDA in the following directories:
2020-01-19 09:33:13.557567: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] ./cuda_sdk_lib
2020-01-19 09:33:13.557578: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] /usr/local/cuda
2020-01-19 09:33:13.557589: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] .
2020-01-19 09:33:13.557601: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:79] You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions. For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
2020-01-19 09:33:13.633426: E external/org_tensorflow/tensorflow/core/platform/default/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-01-19 09:33:13.633613: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:73] Can't find ptxas binary in ${CUDA_DIR}/bin. Will back to the GPU driver for PTX -> sass compilation. This is OK so long as you don't see a warning below about an out-of-date driver version.
2020-01-19 09:33:13.633636: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:74] Searched for CUDA in the following directories:
2020-01-19 09:33:13.633651: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] ./cuda_sdk_lib
2020-01-19 09:33:13.633663: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] /usr/local/cuda
2020-01-19 09:33:13.633700: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] .
2020-01-19 09:33:13.633713: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:79] You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions. For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
2020-01-19 09:33:13.709584: E external/org_tensorflow/tensorflow/core/platform/default/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-01-19 09:33:13.709778: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:73] Can't find ptxas binary in ${CUDA_DIR}/bin. Will back to the GPU driver for PTX -> sass compilation. This is OK so long as you don't see a warning below about an out-of-date driver version.
2020-01-19 09:33:13.709801: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:74] Searched for CUDA in the following directories:
2020-01-19 09:33:13.709815: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] ./cuda_sdk_lib
2020-01-19 09:33:13.709826: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] /usr/local/cuda
2020-01-19 09:33:13.709839: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] .
2020-01-19 09:33:13.709876: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:79] You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions. For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
2020-01-19 09:33:14.256316: E external/org_tensorflow/tensorflow/core/platform/default/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-01-19 09:33:14.256517: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:73] Can't find ptxas binary in ${CUDA_DIR}/bin. Will back to the GPU driver for PTX -> sass compilation. This is OK so long as you don't see a warning below about an out-of-date driver version.
2020-01-19 09:33:14.256540: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:74] Searched for CUDA in the following directories:
2020-01-19 09:33:14.256556: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] ./cuda_sdk_lib
2020-01-19 09:33:14.256568: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] /usr/local/cuda
2020-01-19 09:33:14.256579: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] .
2020-01-19 09:33:14.256591: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:79] You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions. For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
2020-01-19 09:33:31.094227: E external/org_tensorflow/tensorflow/core/platform/default/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-01-19 09:33:31.094430: W external/org_tensorflow/tensorflow/stream_executor/gpu/redzone_allocator.cc:312] Internal: Failed to launch ptxas
Relying on driver to perform ptx compilation. This message will be only logged once.
2020-01-19 09:33:31.177827: E external/org_tensorflow/tensorflow/core/platform/default/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
2020-01-19 09:33:31.255405: E external/org_tensorflow/tensorflow/core/platform/default/subprocess.cc:208] Start cannot fork() child process: Cannot allocate memory
Traceback (most recent call last):
File "/home/xxx/pycharm_proj/trax/trax/trainer.py", line 195, in <module>
app.run(main)
File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "/home/xxx/pycharm_proj/trax/trax/trainer.py", line 189, in main
trainer_lib.train(output_dir=output_dir)
File "/usr/local/lib/python3.6/dist-packages/gin/config.py", line 1078, in gin_wrapper
utils.augment_exception_message_and_reraise(e, err_str)
File "/usr/local/lib/python3.6/dist-packages/gin/utils.py", line 49, in augment_exception_message_and_reraise
six.raise_from(proxy.with_traceback(exception.__traceback__), None)
File "<string>", line 3, in raise_from
File "/usr/local/lib/python3.6/dist-packages/gin/config.py", line 1055, in gin_wrapper
return fn(*new_args, **new_kwargs)
File "/home/xxx/pycharm_proj/trax/trax/supervised/trainer_lib.py", line 641, in train
trainer.train_epoch(epoch_steps, eval_steps)
File "/home/xxx/pycharm_proj/trax/trax/supervised/trainer_lib.py", line 305, in train_epoch
self.train_step(batch)
File "/home/xxx/pycharm_proj/trax/trax/supervised/trainer_lib.py", line 337, in train_step
self._step, opt_state, batch, self._model_state, self._rngs)
File "/usr/local/lib/python3.6/dist-packages/jax/api.py", line 149, in f_jitted
out = xla.xla_call(flat_fun, *args_flat, device=device, backend=backend)
File "/usr/local/lib/python3.6/dist-packages/jax/core.py", line 602, in call_bind
outs = primitive.impl(f, *args, **params)
File "/usr/local/lib/python3.6/dist-packages/jax/interpreters/xla.py", line 442, in _xla_call_impl
compiled_fun = _xla_callable(fun, device, backend, *map(arg_spec, args))
File "/usr/local/lib/python3.6/dist-packages/jax/linear_util.py", line 223, in memoized_fun
ans = call(fun, *args)
File "/usr/local/lib/python3.6/dist-packages/jax/interpreters/xla.py", line 499, in _xla_callable
compiled = built.Compile(compile_options=options, backend=xb.get_backend(backend))
File "/usr/local/lib/python3.6/dist-packages/jaxlib/xla_client.py", line 609, in Compile
return backend.compile(self.computation, compile_options)
File "/usr/local/lib/python3.6/dist-packages/jaxlib/xla_client.py", line 161, in compile
compile_options.device_assignment)
RuntimeError: Internal: Failed to launch ptxas
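The warnings above suggest setting XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda. A minimal sketch of that suggestion follows. The helper names `find_cuda_dir_with_ptxas` and `xla_flags_for` are hypothetical, and the candidate list is an assumption: it combines the directories the log says were searched with the two versioned installs mentioned in this report. It only checks whether a ptxas file exists. It does not address the separate "cannot fork() child process: Cannot allocate memory" error that precedes each warning.

```python
import os
from pathlib import Path

# Assumed candidates: the three directories listed in the log, plus the
# versioned CUDA installs mentioned in this report.
CANDIDATE_CUDA_DIRS = [
    "./cuda_sdk_lib",
    "/usr/local/cuda",
    ".",
    "/usr/local/cuda-10.0",
    "/usr/local/cuda-10.1",
]


def find_cuda_dir_with_ptxas(candidates):
    """Return the first directory containing an executable bin/ptxas, or None."""
    for cuda_dir in candidates:
        ptxas = Path(cuda_dir) / "bin" / "ptxas"
        if ptxas.is_file() and os.access(ptxas, os.X_OK):
            return str(Path(cuda_dir).resolve())
    return None


def xla_flags_for(cuda_dir, existing=""):
    """Append --xla_gpu_cuda_data_dir to an existing XLA_FLAGS value."""
    flag = f"--xla_gpu_cuda_data_dir={cuda_dir}"
    return f"{existing} {flag}".strip()


if __name__ == "__main__":
    cuda_dir = find_cuda_dir_with_ptxas(CANDIDATE_CUDA_DIRS)
    if cuda_dir is None:
        print("No ptxas found in any candidate directory")
    else:
        # Set the variable before the XLA backend starts; doing it before
        # importing jax is the cautious choice.
        os.environ["XLA_FLAGS"] = xla_flags_for(
            cuda_dir, os.environ.get("XLA_FLAGS", "")
        )
        print(os.environ["XLA_FLAGS"])
```

If the script prints a flag value, the same value can be exported in the shell before starting trainer.py.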
Issue Analytics
- State:
- Created 4 years ago
- Reactions:4
- Comments:5 (1 by maintainers)

This looks like a CUDA or JAX issue. I cannot reproduce it, and it would probably be better to ask there. Sorry I cannot help more!
@sanjibnarzary Sorry, my configuration is:
ptxas exists in /usr/local/cuda-10.0/bin and /usr/local/cuda-10.1/bin
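
Since ptxas is present in both directories, it may help to separate "binary not found" from "binary could not be launched". The log reports "cannot fork() child process: Cannot allocate memory" before every ptxas warning. The sketch below is a hypothetical diagnostic helper (`probe_binary` is not part of JAX or TensorFlow) that launches a binary with `--version` and classifies the result. One caveat: a probe run from a small, fresh Python process may succeed even when the much larger training process fails to fork.

```python
import errno
import subprocess


def probe_binary(path, args=("--version",)):
    """Try to launch `path` and classify the outcome.

    Returns "ok", "missing", "fork-failed", or "error".
    """
    try:
        subprocess.run(
            [path, *args],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            check=True,
        )
        return "ok"
    except FileNotFoundError:
        # The path does not exist.
        return "missing"
    except subprocess.CalledProcessError:
        # The process started but exited with a non-zero status.
        return "error"
    except OSError as exc:
        # ENOMEM is the errno behind "Cannot allocate memory".
        if exc.errno == errno.ENOMEM:
            return "fork-failed"
        return "error"


if __name__ == "__main__":
    for candidate in ("/usr/local/cuda-10.0/bin/ptxas",
                      "/usr/local/cuda-10.1/bin/ptxas"):
        print(candidate, "->", probe_binary(candidate))
```

A "fork-failed" result would point at memory pressure on the host rather than at the CUDA installation path.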