[Bug Report]: Cudnn_status_execution_failed on RTX GPUs

Bug Description

Running the demo program “cifar10_tutorial.py” gives a runtime error on an RTX 2080 Ti. The output is shown below:

    Using TensorFlow backend.
    ==> Preparing data…
    Saving Directory: /tmp/autokeras_160TXP

    Initializing search.
    Initialization finished.

    +----------------------------------------------+
    |               Training model 0               |
    +----------------------------------------------+
    Epoch-1, Current Metric - 0: 0%| | 0/1921 [00:00<?, ? batch/s]Process ForkProcess-1:

    Traceback (most recent call last):
    …
      File "/home/xxx/anaconda3/envs/tensorflow/lib/python3.6/site-packages/torch/nn/functional.py", line 1623, in batch_norm
        training, momentum, eps, torch.backends.cudnn.enabled
    RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

However, when I run exactly the same code on a Titan Xp, everything works perfectly fine.

Reproducing Steps

Steps to reproduce the behavior:

  1. Create a conda virtual env with python==3.6.
  2. pip install tensorflow==1.12.0 tensorflow-gpu==1.12.0 torch==1.0.1 torchvision keras autokeras
  3. Run the demo.
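The traceback ends inside `batch_norm`, so a much smaller script may be enough to tell whether the failure depends on AutoKeras at all. The following is a minimal sketch, not part of the original report: `batch_norm_probe` is a hypothetical helper name, and the input shape is only chosen to resemble a CIFAR-10 batch. It runs a single batch-norm forward pass on the GPU and returns a short status string.

```python
def batch_norm_probe():
    """Run one batch-norm forward pass on the GPU and report what happened.

    Hypothetical helper for illustration; it is not part of AutoKeras or PyTorch.
    """
    try:
        import torch
    except ImportError:
        return "torch-not-installed"

    if not torch.cuda.is_available():
        return "cuda-not-available"

    try:
        # A freshly built layer is in training mode, like the failing call.
        layer = torch.nn.BatchNorm2d(3).cuda()
        # Batch of 8 RGB images at 32x32, roughly what the CIFAR-10 demo feeds in.
        batch = torch.randn(8, 3, 32, 32, device="cuda")
        layer(batch)
        # CUDA kernels run asynchronously; synchronizing surfaces errors here.
        torch.cuda.synchronize()
        return "ok"
    except RuntimeError as err:
        return "failed: {}".format(err)


if __name__ == "__main__":
    print(batch_norm_probe())
```

If this reports a cuDNN failure on the RTX card but "ok" on the Titan Xp, the problem most likely sits in the PyTorch/CUDA installation rather than in the demo itself.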

Expected Behavior

The demo should run normally.

Setup Details

Include the details about the versions of:

  • OS type and version: Ubuntu 16.04
  • Python: 3.6.7
  • autokeras: master
  • scikit-learn: 0.20.2
  • numpy: 1.15.4
  • keras: 2.2.4
  • scipy: 1.2.0
  • tensorflow: 1.12.0
  • pytorch: 1.0.1
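The list above does not include the CUDA or cuDNN versions, which are often the deciding factor when code works on one GPU generation and fails on another. The sketch below collects them from PyTorch. It rests on an assumption that the issue does not confirm: that the failure comes from a PyTorch build whose CUDA toolkit predates the GPU architecture. The RTX 2080 Ti (Turing, compute capability 7.5) needs CUDA 10.0 or newer, while the Titan Xp (Pascal, 6.1) is supported from CUDA 8.0. The helper names and the small lookup table are illustrative only.

```python
# Earliest CUDA toolkit release with native support for a few compute
# capabilities. Deliberately small; extend it for other GPUs as needed.
MIN_CUDA_FOR_CAPABILITY = {
    (6, 1): (8, 0),   # Pascal, e.g. Titan Xp
    (7, 0): (9, 0),   # Volta
    (7, 5): (10, 0),  # Turing, e.g. RTX 2080 Ti
}


def parse_version(text):
    """Turn a string such as '9.0.176' into the tuple (9, 0)."""
    return tuple(int(part) for part in text.split(".")[:2])


def toolkit_supports(cuda_version, capability):
    """Return True/False, or None when the capability is not in the table."""
    required = MIN_CUDA_FOR_CAPABILITY.get(tuple(capability))
    if required is None:
        return None
    return parse_version(cuda_version) >= required


def describe_environment():
    """Collect the versions that are missing from the setup details."""
    try:
        import torch
    except ImportError:
        return {"torch": None}

    info = {
        "torch": torch.__version__,
        "cuda": torch.version.cuda,  # None for CPU-only builds
        "cudnn": torch.backends.cudnn.version(),
    }
    if torch.cuda.is_available():
        info["gpu"] = torch.cuda.get_device_name(0)
        info["capability"] = torch.cuda.get_device_capability(0)
        if info["cuda"]:
            info["toolkit_ok"] = toolkit_supports(info["cuda"], info["capability"])
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print("{}: {}".format(key, value))
```

A `toolkit_ok` value of `False` would point toward installing a PyTorch build made for a newer CUDA toolkit; `None` only means the table has no entry for that GPU.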

Additional context

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Reactions: 4
  • Comments: 5

Top GitHub Comments

vivekpd15 commented, Apr 3, 2019 (7 reactions)

I was able to fix this by using the latest commit from master and building it with the git clone method, following the bleeding-edge (manual) install instructions.

Modified the setup.py file to the following after cloning:

    from distutils.core import setup
    from setuptools import find_packages

    setup(
        name='autokeras',
        packages=find_packages(exclude=('tests',)),
        install_requires=[
            'scipy==1.2.0',
            'tensorflow-gpu==1.13.1',
            'numpy==1.16.1',
            'scikit-learn==0.20.2',
            'scikit-image==0.14.2',
            'tqdm==4.31.0',
            'imageio==2.5.0',
            'requests==2.21.0',
        ],
        version='0.3.7',
        description='AutoML for deep learning',
        author='DATA Lab at Texas A&M University',
        author_email='jhfjhfj1@gmail.com',
        url='http://autokeras.com',
        download_url='https://github.com/keras-team/autokeras/archive/0.3.7.tar.gz',
        keywords=['AutoML', 'keras'],
        classifiers=[]
    )

Notice the change in install_requires

Steps to install:

  1. conda create -c conda-forge -n autokeras python=3.6
  2. conda install -c conda-forge cython
  3. conda install pytorch torchvision cudatoolkit=10.0 -c pytorch
  4. pip install -r requirements.txt
  5. pip install keras
  6. pip install -r requirements.txt
  7. python setup.py install

Hope it helps.
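After following steps like these, it can be worth confirming that the environment really ended up with the pinned versions, since a later `pip install` may replace one of them. The sketch below is an illustration, not something from the original comment: `check_pins` and its helpers are hypothetical names, and the pin list is copied from the modified `install_requires`. It relies on `importlib.metadata`, which is in the standard library from Python 3.8; on the Python 3.6 environment described here it assumes the `importlib_metadata` backport is installed.

```python
def parse_pin(requirement):
    """Split 'name==version' into (name, version)."""
    name, _, version = requirement.partition("==")
    return name.strip(), version.strip()


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:  # Python < 3.8: assumes the importlib_metadata backport
        from importlib_metadata import PackageNotFoundError, version
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def check_pins(requirements, lookup=installed_version):
    """Map each mismatching package to a (wanted, found) pair."""
    mismatches = {}
    for requirement in requirements:
        name, wanted = parse_pin(requirement)
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


# Pins taken from the modified install_requires in the comment above.
PINS = [
    "scipy==1.2.0",
    "tensorflow-gpu==1.13.1",
    "numpy==1.16.1",
    "scikit-learn==0.20.2",
    "scikit-image==0.14.2",
    "tqdm==4.31.0",
    "imageio==2.5.0",
    "requests==2.21.0",
]

if __name__ == "__main__":
    for name, (wanted, found) in sorted(check_pins(PINS).items()):
        print("{}: wanted {}, found {}".format(name, wanted, found))
```

The `lookup` parameter exists only so the comparison can be exercised without touching the real environment, for example by passing the `get` method of a plain dictionary.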

stale[bot] commented, Oct 14, 2019 (0 reactions)

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
