
Failed to determine best cudnn convolution algorithm/No GPU/TPU found


RTX 3080 / CUDA 11.1 / cuDNN 8.2.1 / Ubuntu 16.04

This problem occurs with jaxlib-0.1.72+cuda111. When I update to 0.1.74, it disappears. However, with 0.1.74, JAX cannot detect the GPU, although TensorFlow can.

So whether I use 0.1.72 or 0.1.74, I always run into a problem.
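A quick way to confirm what JAX itself sees, independent of TensorFlow, is to ask it for its default backend and device list. The following is a minimal sketch; the helper name `describe_jax_backend` is hypothetical and not part of JAX, and it assumes a JAX version that provides `jax.default_backend()` and `jax.devices()`.

```python
def describe_jax_backend():
    """Return the backend and devices JAX reports, or None if JAX is not installed."""
    try:
        import jax
    except ImportError:
        return None
    return {
        # The platform name JAX will use by default (for example a CPU or GPU backend).
        "default_backend": jax.default_backend(),
        # Human-readable descriptions of every device visible to that backend.
        "devices": [str(d) for d in jax.devices()],
    }


print(describe_jax_backend())
```

If the reported backend is the CPU while TensorFlow sees the GPU, that points at the jaxlib build or its CUDA libraries rather than at the driver.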

`RuntimeError: UNKNOWN: Failed to determine best cudnn convolution algorithm: INTERNAL: All algorithms tried for %custom-call.1 = (f32[1,112,112,64]{2,1,3,0}, u8[0]{0}) custom-call(f32[1,229,229,3]{2,1,3,0} %pad, f32[7,7,3,64]{1,0,2,3} %copy.4), window={size=7x7 stride=2x2}, dim_labels=b01f_01io->b01f, custom_call_target="__cudnn$convForward", metadata={op_type="conv_general_dilated" op_name="jit(conv_general_dilated)/conv_general_dilated[\n batch_group_count=1\n dimension_numbers=ConvDimensionNumbers(lhs_spec=(0, 3, 1, 2), rhs_spec=(3, 2, 0, 1), out_spec=(0, 3, 1, 2))\n feature_group_count=1\n lhs_dilation=(1, 1)\n lhs_shape=(1, 224, 224, 3)\n padding=((2, 3), (2, 3))\n precision=None\n preferred_element_type=None\n rhs_dilation=(1, 1)\n rhs_shape=(7, 7, 3, 64)\n window_strides=(2, 2)\n]" source_file="/media/node/Materials/anaconda3/envs/xmcgan/lib/python3.9/site-packages/flax/linen/linear.py" source_line=282}, backend_config="{"algorithm":"0","tensor_ops_enabled":false,"conv_result_scale":1,"activation_mode":"0","side_input_scale":0}" failed. Falling back to default algorithm. Convolution performance may be suboptimal. To ignore this failure and try to use a fallback algorithm, use XLA_FLAGS=--xla_gpu_strict_conv_algorithm_picker=false. Please also file a bug for the root cause of failing autotuning.`
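The flag named in the error message can be set from Python as well as from the shell. The sketch below appends it to any existing `XLA_FLAGS` value instead of overwriting it. The helper `add_xla_flag` is a hypothetical name used only for illustration, and the sketch assumes the code runs before `jax` is imported, which is the safe place to set XLA environment variables.

```python
import os


def add_xla_flag(flag: str) -> str:
    """Append `flag` to XLA_FLAGS unless it is already present; return the new value."""
    current = os.environ.get("XLA_FLAGS", "").split()
    if flag not in current:
        current.append(flag)
    os.environ["XLA_FLAGS"] = " ".join(current)
    return os.environ["XLA_FLAGS"]


# Run this before `import jax` so the flag is in place when XLA starts up.
add_xla_flag("--xla_gpu_strict_conv_algorithm_picker=false")
```

Note that this only tells XLA to fall back to a default algorithm; it does not address whatever made autotuning fail in the first place.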

Issue Analytics

  • State: open
  • Created 2 years ago
  • Reactions: 12
  • Comments: 12 (2 by maintainers)

Top GitHub Comments

11 reactions
half-potato commented, Mar 16, 2022

Turns out it was an OOM error hidden behind a bad error message. The solution is in #8506: set the environment variable XLA_PYTHON_CLIENT_MEM_FRACTION=0.87. There appears to be some kind of issue with how jax.scipy.signal.convolve2d handles preallocated memory. It would be nice to have a better error message for this.
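As a rough illustration of the workaround described in this comment, the environment variable can be set from Python before JAX is imported. The helper name `limit_jax_gpu_memory` is hypothetical, and 0.87 is simply the value quoted above; the right fraction for a given GPU and workload is an assumption to tune.

```python
import os


def limit_jax_gpu_memory(fraction: float = 0.87) -> None:
    """Set the fraction of GPU memory that JAX preallocates (must be in (0, 1])."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError(f"fraction must be in (0, 1], got {fraction}")
    os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"] = str(fraction)


# Call this before `import jax`; the setting is read when the GPU backend starts.
limit_jax_gpu_memory(0.87)
```

The same effect can be had from the shell by exporting `XLA_PYTHON_CLIENT_MEM_FRACTION=0.87` before launching the script.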

7 reactions
ross-Hr commented, Jan 4, 2022

Did you fix the error?


