[Bug Report]: CUDNN_STATUS_EXECUTION_FAILED on RTX GPUs
Bug Description
Running the demo program “cifar10_tutorial.py” gives a runtime error on an RTX 2080 Ti. The output is shown below:
Using TensorFlow backend.
==> Preparing data…
Saving Directory: /tmp/autokeras_160TXP
Initializing search.
Initialization finished.
+----------------------------------------------+
| Training model 0 |
+----------------------------------------------+
Epoch-1, Current Metric - 0: 0%| | 0/1921 [00:00<?, ? batch/s]
Process ForkProcess-1:
Traceback (most recent call last):
…
File "/home/xxx/anaconda3/envs/tensorflow/lib/python3.6/site-packages/torch/nn/functional.py", line 1623, in batch_norm
training, momentum, eps, torch.backends.cudnn.enabled
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED
However, when I run exactly the same code on a Titan Xp, everything works fine.
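The traceback ends in torch.nn.functional.batch_norm, so one way to narrow this down is to run that single call on the GPU outside autokeras. The sketch below is not part of the original report: `run_batch_norm_check` is a hypothetical helper, the tensor shape is an assumed CIFAR-10-sized batch, and switching `torch.backends.cudnn.enabled` off is only a diagnostic to see whether the failure is limited to the cuDNN path.

```python
def run_batch_norm_check(use_cudnn=True):
    """Run one batch-norm forward pass on the GPU and describe the outcome.

    Hypothetical helper for diagnosis only; it does not reproduce the
    autokeras model, just the call named in the traceback.
    """
    try:
        import torch
        import torch.nn.functional as F
    except ImportError:
        return "torch not installed"
    if not torch.cuda.is_available():
        return "no CUDA device available"

    previous = torch.backends.cudnn.enabled
    torch.backends.cudnn.enabled = use_cudnn
    try:
        # Assumed shape: 8 RGB images of 32x32 pixels, as in CIFAR-10.
        x = torch.randn(8, 3, 32, 32, device="cuda")
        running_mean = torch.zeros(3, device="cuda")
        running_var = torch.ones(3, device="cuda")
        F.batch_norm(x, running_mean, running_var, training=True)
        # CUDA kernels run asynchronously; synchronize so errors surface here.
        torch.cuda.synchronize()
        return "ok"
    except RuntimeError as err:
        return "failed: {}".format(err)
    finally:
        # Restore the global flag so the check has no lasting side effect.
        torch.backends.cudnn.enabled = previous


if __name__ == "__main__":
    for flag in (True, False):
        print("cudnn enabled={}: {}".format(flag, run_batch_norm_check(flag)))
```

If the call fails with cuDNN enabled but passes with it disabled, that would suggest the problem sits in the cuDNN build bundled with the installed torch rather than in autokeras itself.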
Reproducing Steps
Steps to reproduce the behavior:
- Create a conda virtual env with python==3.6.
- pip install tensorflow==1.12.0 tensorflow-gpu==1.12.0 torch==1.0.1 torchvision keras autokeras
- Run the demo.
Expected Behavior
The demo should run normally.
Setup Details
Include the details about the versions of:
- OS type and version: Ubuntu 16.04
- Python: 3.6.7
- autokeras: master
- scikit-learn: 0.20.2
- numpy: 1.15.4
- keras: 2.2.4
- scipy: 1.2.0
- tensorflow: 1.12.0
- pytorch: 1.0.1
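The list above does not include the CUDA and cuDNN versions that the installed torch build uses, or the GPU's compute capability, and these often matter when the same code behaves differently on two GPU generations. A minimal sketch for collecting them follows; `collect_gpu_environment` is a hypothetical name, and the function only reports what torch itself exposes.

```python
import platform


def collect_gpu_environment():
    """Gather interpreter, torch, CUDA, cuDNN and GPU details into a dict."""
    info = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    try:
        import torch
    except ImportError:
        info["torch"] = None
        return info

    info["torch"] = torch.__version__
    # CUDA version torch was built against; None for CPU-only builds.
    info["cuda_build"] = torch.version.cuda
    # cuDNN version as an integer; None if cuDNN is unavailable.
    info["cudnn"] = torch.backends.cudnn.version()
    if torch.cuda.is_available():
        info["gpu"] = torch.cuda.get_device_name(0)
        info["compute_capability"] = torch.cuda.get_device_capability(0)
    return info


if __name__ == "__main__":
    for key, value in collect_gpu_environment().items():
        print("{}: {}".format(key, value))
```

Running this on both the RTX 2080 Ti and the Titan Xp machines and comparing the output would show whether the two environments differ in anything besides the card.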
Additional context
Issue Analytics
- State:
- Created 5 years ago
- Reactions:4
- Comments:5
Top GitHub Comments
I was able to fix this by building from the latest commit on master, using the git clone method from the bleeding-edge (manual) install instructions.
Modified the setup.py file to the following after cloning:
`from distutils.core import setup
from setuptools import find_packages

setup(
    name='autokeras',
    packages=find_packages(exclude=('tests',)),
    install_requires=[
        'scipy==1.2.0',
        'tensorflow-gpu==1.13.1',
        'numpy==1.16.1',
        'scikit-learn==0.20.2',
        'scikit-image==0.14.2',
        'tqdm==4.31.0',
        'imageio==2.5.0',
        'requests==2.21.0',
    ],
    version='0.3.7',
    description='AutoML for deep learning',
    author='DATA Lab at Texas A&M University',
    author_email='jhfjhfj1@gmail.com',
    url='http://autokeras.com',
    download_url='https://github.com/keras-team/autokeras/archive/0.3.7.tar.gz',
    keywords=['AutoML', 'keras'],
    classifiers=[]
)`
Notice the change in install_requires
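To confirm that an environment actually ended up with the pinned versions from install_requires, something like the following could be used. It is only a sketch: `parse_pin`, `check_pins` and `lookup_installed` are hypothetical helpers, only `name==version` pins are handled, and `importlib.metadata` needs Python 3.8 or newer (on the Python 3.6 environment from the report, `pkg_resources` would be the alternative).

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the modified setup.py above.
PINS = [
    "scipy==1.2.0",
    "tensorflow-gpu==1.13.1",
    "numpy==1.16.1",
    "scikit-learn==0.20.2",
    "scikit-image==0.14.2",
    "tqdm==4.31.0",
    "imageio==2.5.0",
    "requests==2.21.0",
]


def parse_pin(requirement):
    """Split 'name==version' into (name, version)."""
    name, _, wanted = requirement.partition("==")
    return name.strip(), wanted.strip()


def lookup_installed(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def check_pins(requirements, installed_version=lookup_installed):
    """Map each package name to (wanted, found, matches)."""
    report = {}
    for requirement in requirements:
        name, wanted = parse_pin(requirement)
        found = installed_version(name)
        report[name] = (wanted, found, found == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, found, ok) in check_pins(PINS).items():
        status = "ok" if ok else "MISMATCH"
        print("{:<16} wanted {:<8} found {!s:<8} {}".format(name, wanted, found, status))
```

The lookup is passed in as a parameter so the comparison logic can be exercised without touching the real environment.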
Steps to install:
Hope it helps.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.