[Bug] CUDA error when running sample code
I am running the basic code to train a PPO agent on CartPole and it gives me a CUDA error:
RuntimeError: CUDA error: CUBLAS_STATUS_INTERNAL_ERROR when calling cublasCreate(handle)
in "/.local/lib/python3.6/site-packages/torch/nn/functional.py", line 1753, in linear
    return torch._C._nn.linear(input, weight, bias)
Python version: 3.6.9
PyTorch version: 1.8.0
The code is:
import gym
from stable_baselines3 import PPO

# Train a PPO agent on CartPole
env = gym.make('CartPole-v1')
model = PPO('MlpPolicy', env, verbose=1)
model.learn(total_timesteps=10000)

# Run the trained agent and render each step
obs = env.reset()
for i in range(1000):
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, done, info = env.step(action)
    env.render()
    if done:
        obs = env.reset()
env.close()
Issue Analytics
- Created 3 years ago
- Comments: 9 (4 by maintainers)
Top GitHub Comments
Same issue for me; it seems to be related to torch 1.8 (https://github.com/pytorch/pytorch/issues/53336).
Downgrading to torch 1.7.1 seems to work fine for now.
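To confirm which build is actually installed before and after a downgrade, it may help to print what PyTorch reports about itself. This is only a minimal sketch: the helper name `describe_torch` is hypothetical, and it only reads `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`.

```python
def describe_torch():
    """Return version details for the installed PyTorch, or None if it is missing."""
    try:
        import torch
    except ImportError:
        return None
    return {
        "torch": torch.__version__,            # e.g. a 1.8.x or 1.7.x version string
        "cuda_build": torch.version.cuda,      # None on CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(describe_torch())
```

If `torch` still reports a 1.8 version after the downgrade, the script is probably running in a different environment from the one where the package was changed.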
I switched to the CPU by passing 'cpu' to PPO. It trains for two epochs, then breaks again with the same CUDA error.
I have two different machines, and the code works on neither of them.
Anyway, I will check my CUDA installation.
Thanks
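One way to check the CUDA installation independently of stable_baselines3 is to run a tiny matrix multiply on the GPU, which is expected to go through cuBLAS, the library named in the error. The sketch below is an assumption-laden illustration rather than a fix: `pick_device` is a hypothetical helper, and a failed CUDA call can leave the process in a bad state, so treat a failure as a diagnostic result and restart Python afterwards.

```python
def pick_device():
    """Return "cuda" if a tiny GPU matmul succeeds, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"
    if not torch.cuda.is_available():
        return "cpu"
    try:
        # A small matrix multiply on the GPU; this should exercise cuBLAS.
        a = torch.randn(8, 8, device="cuda")
        (a @ a).sum().item()  # .item() waits for the GPU result
        return "cuda"
    except RuntimeError as exc:
        print(f"CUDA check failed: {exc}")
        return "cpu"


if __name__ == "__main__":
    print(pick_device())
```

The result can then be passed to the `device` argument of `PPO`, for example `PPO('MlpPolicy', env, device=pick_device(), verbose=1)`. As reported above, passing 'cpu' did not remove the error in this case, so this check is mainly useful for telling a broken CUDA setup apart from a problem in the training code.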