[Bug] CUDA error when running sample code

I am running the basic code to train a PPO agent on CartPole and it gives me a CUDA error:

RuntimeError: CUDA error: CUBLAS_STATUS_INTERNAL_ERROR when calling cublasCreate(handle)

in "/.local/lib/python3.6/site-packages/torch/nn/functional.py", line 1753, in linear
    return torch._C._nn.linear(input, weight, bias)

Python version 3.6.9

PyTorch version 1.8.0

The code is:

import gym

from stable_baselines3 import PPO

env = gym.make('CartPole-v1')

model = PPO('MlpPolicy', env, verbose=1)
model.learn(total_timesteps=10000)

obs = env.reset()
for i in range(1000):
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, done, info = env.step(action)
    env.render()
    if done:
        obs = env.reset()

env.close()
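
One way to separate an SB3 problem from a PyTorch/CUDA problem is to run the failing operation on its own. The sketch below is not from the original report. It calls torch.nn.functional.linear (the call named in the traceback) on small random GPU tensors, with no SB3 or gym involved. The function name cublas_smoke_test and the status strings it returns are made up for illustration.

```python
def cublas_smoke_test():
    """Run one small linear op on the GPU and report what happened.

    Hypothetical helper: the name and the status strings are illustrative only.
    """
    try:
        import torch
        import torch.nn.functional as F
    except (ImportError, OSError):
        return "torch-missing"

    if not torch.cuda.is_available():
        return "cuda-unavailable"

    try:
        x = torch.randn(4, 8, device="cuda")
        weight = torch.randn(2, 8, device="cuda")
        bias = torch.zeros(2, device="cuda")
        out = F.linear(x, weight, bias)  # same op as in the traceback
        torch.cuda.synchronize()  # surface asynchronous CUDA errors here
        return "ok" if tuple(out.shape) == (4, 2) else "unexpected-shape"
    except RuntimeError as err:
        return f"cuda-error: {err}"


if __name__ == "__main__":
    print(cublas_smoke_test())
```

If this reports a cuda-error that mentions CUBLAS_STATUS_INTERNAL_ERROR, the failure can probably be reproduced without SB3. If it reports ok, a plain GPU matrix multiply works in this environment and the cause likely lies elsewhere.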

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 9 (4 by maintainers)

Top GitHub Comments

1 reaction
ac-93 commented, Mar 10, 2021

Same issue for me; it seems to be related to torch 1.8 (https://github.com/pytorch/pytorch/issues/53336)

Downgrading to torch 1.7.1 seems to work fine for now.
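
Before downgrading, it can help to confirm which build is actually installed. The following is a minimal sketch, not part of the original comment: the helper names parse_major_minor and is_reported_series are hypothetical, and treating the whole 1.8 series as affected is an assumption based only on the reports in this thread.

```python
import re


def parse_major_minor(version):
    """Extract (major, minor) from a version string such as '1.8.0+cu111'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_reported_series(version):
    """True for the 1.8 series, which this thread reports as failing.

    Assumption: the thread only mentions 1.8.0; other 1.8.x builds may differ.
    """
    return parse_major_minor(version) == (1, 8)


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed")
    else:
        print("torch:", torch.__version__)
        print("built for CUDA:", torch.version.cuda)
        print("in reported series:", is_reported_series(torch.__version__))
```

If the installed build is in the 1.8 series, pinning the older release with pip install torch==1.7.1 matches the workaround described above.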

1 reaction
youryz commented, Mar 8, 2021

Actually no, @Miffyli, the code araffin provided works. My own deep learning projects based on PyTorch also work (MLP training and inference). By "sample code" I mean the code in my initial post, which trains a PPO agent. I am posting it again here:

import gym

from stable_baselines3 import PPO

env = gym.make('CartPole-v1')

model = PPO('MlpPolicy', env, verbose=1)
model.learn(total_timesteps=10000)

obs = env.reset()
for i in range(1000):
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, done, info = env.step(action)
    env.render()
    if done:
        obs = env.reset()

env.close()

I’ve checked my PyTorch/CUDA installation and found no problem with it. I will check again just in case, but I don’t think there should be any problem, since my past projects worked well.

The error that you have is definitely a PyTorch/CUDA error. I have tested SB3 with Python 3.6 and PyTorch 1.8 on my machine and on a Google Colab instance and did not get any error… I guess the code will also run if you use the CPU only.

I switched to the CPU by passing ‘cpu’ to PPO. It trains for two epochs, then breaks again with the same CUDA error.
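
For reference, a CPU-only run can be sketched as follows. This is an illustration, not the code that was actually executed for this report. It uses the device argument of PPO and also sets CUDA_VISIBLE_DEVICES to an empty string so that no GPU is visible to PyTorch at all. The helper names hide_gpus and train_on_cpu are hypothetical.

```python
import os


def hide_gpus():
    """Hide every CUDA device from libraries that initialise CUDA afterwards."""
    os.environ["CUDA_VISIBLE_DEVICES"] = ""


def train_on_cpu(total_timesteps=10000):
    """Train PPO on CartPole with the GPU hidden and device='cpu'."""
    # Must run before CUDA is initialised; safest before torch is imported.
    hide_gpus()

    import gym
    from stable_baselines3 import PPO

    env = gym.make("CartPole-v1")
    model = PPO("MlpPolicy", env, device="cpu", verbose=1)
    model.learn(total_timesteps=total_timesteps)
    assert model.device.type == "cpu"  # sanity check on the chosen device
    env.close()
    return model


if __name__ == "__main__":
    train_on_cpu()
```

With the GPU hidden this way, torch.cuda.is_available() should return False, so a CUDA error appearing in such a run would suggest that the environment variable was set too late or that a different process is involved.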

I have two different machines and the code works on neither one.

Anyway I will check my CUDA installation.

Thanks
