PPO2 with MlpLstmPolicy crashes GPU
Describe the bug
While training with PPO2 and MlpLstmPolicy on a custom env, my computer intermittently freezes, yet training continues. When I try to monitor the GPUs with watch -n0.5 nvidia-smi, it loads the first GPU's data, then seems to hang for a while until I see that my second GPU has an error. Even after training finishes, anything that uses a GPU glitches, and I have to reset the computer before I can even run another model. I've run the same training on the same env using just MlpPolicy and it trains just fine (although my problem needs a recurrent network, so I always get bad results), and I can monitor everything without GPU glitches. I thought it might be memory overload, but I don't get anywhere near using all of the RAM or GPU memory.
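As a diagnostic (my suggestion, not something from the original thread), you can restrict TensorFlow to a single GPU before the model is built, which at least tells you whether the second card is the culprit. CUDA_VISIBLE_DEVICES is a standard CUDA environment variable:
import os
# Must be set before TensorFlow creates its session, i.e. before
# PPO2(...) is constructed. "0" keeps only the first GPU visible.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"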
Code example
import gym
import money_maker  # registers the custom 'maker-v0' env with gym

from stable_baselines.common.policies import MlpLstmPolicy
from stable_baselines.common.vec_env import SubprocVecEnv
from stable_baselines import PPO2

# Multiprocess environment: one subprocess per env instance.
n_cpu = 128
env = SubprocVecEnv([lambda: gym.make('maker-v0') for _ in range(n_cpu)])

# Note: with a recurrent policy, the number of parallel envs must be a
# multiple of nminibatches (128 / 32 = 4 envs per minibatch here).
model = PPO2(MlpLstmPolicy, env, verbose=1, nminibatches=32,
             tensorboard_log="./ppo2_lstm_21_jan_morn_tensorboard/")
model.learn(total_timesteps=10000000)
model.save("ppo2_maker_lstm")

del model
env.close()
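For reference, reloading the saved model later follows the standard stable-baselines API (PPO2.load is part of the library; an env can be re-attached with set_env before further training):
from stable_baselines import PPO2

# Reload the trained weights; attach an env again before further
# learning or vectorized prediction.
model = PPO2.load("ppo2_maker_lstm")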
System Info
Describe the characteristics of your environment:
- Installed stable-baselines with pip, following the docs
- 2 × GTX 1080 Ti GPUs, driver 410.93
- Python 3.6.5 in a conda environment
- TensorFlow version: 1.12.0
Top GitHub Comments
Awesome!!! No more crashing and logs are about 100x smaller. Good job guys!
Hey,
The logging parameters are usually found in def setup_model(self): in the model file (e.g. stable-baselines/ppo2/ppo2.py), and look like tf.summary.[type]([name], [value]). If you comment out everything except the scalars, that should reduce the logging size by an order of magnitude.
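To make that concrete, here is a minimal, self-contained sketch of the pattern (TF 1.x API to match the reported version 1.12; the tensor names are placeholders, not the actual ones in ppo2.py):
import tensorflow as tf

# Placeholder tensors so the snippet runs on its own; in setup_model()
# these would be the real loss/advantage tensors.
pg_loss = tf.constant(0.0, name="pg_loss")
vf_loss = tf.constant(0.0, name="vf_loss")
advantages = tf.zeros([128], name="advantages")

# Scalar summaries are tiny; keep these.
tf.summary.scalar("policy_loss", pg_loss)
tf.summary.scalar("value_loss", vf_loss)

# Histogram (and image) summaries dominate the event-file size, so
# commenting these out is what shrinks the logs.
# tf.summary.histogram("advantages", advantages)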