PPO2 performance and GPU utilization
I am running a PPO2 model and see high CPU utilization but low GPU utilization.
When running:
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
I get:
Python 3.7.3 (default, Mar 27 2019, 17:13:21) [MSC v.1915 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from tensorflow.python.client import device_lib
>>> print(device_lib.list_local_devices())
2019-05-06 11:06:02.117760: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2019-05-06 11:06:02.341488: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties:
name: GeForce GTX 1660 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.77
pciBusID: 0000:01:00.0
totalMemory: 6.00GiB freeMemory: 4.92GiB
2019-05-06 11:06:02.348112: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-05-06 11:06:02.838521: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-05-06 11:06:02.842724: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0
2019-05-06 11:06:02.845154: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N
2019-05-06 11:06:02.848092: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/device:GPU:0 with 4641 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1660 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5)
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 8905916217148098349
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 4866611609
locality {
bus_id: 1
links {
}
}
incarnation: 7192145949653879362
physical_device_desc: "device: 0, name: GeForce GTX 1660 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5"
]
I understand that TensorFlow is "seeing" my GPU. Why is GPU utilization so low when training a Stable Baselines model?
from stable_baselines import PPO2
from stable_baselines.common.policies import MlpPolicy, MlpLstmPolicy, MlpLnLstmPolicy
from stable_baselines.common.vec_env import SubprocVecEnv

# multiprocess environment: each worker process builds its own PortfolioEnv
# (PortfolioEnv and settings are defined elsewhere in the project)
n_cpu = 4
def make_env():
    return PortfolioEnv(total_steps=settings['total_steps'],
                        window_length=settings['window_length'],
                        allow_short=settings['allow_short'])
env = SubprocVecEnv([make_env for _ in range(n_cpu)])

if settings['policy'] == 'MlpPolicy':
    model = PPO2(MlpPolicy, env, verbose=0, tensorboard_log=settings['tensorboard_log'])
elif settings['policy'] == 'MlpLstmPolicy':
    model = PPO2(MlpLstmPolicy, env, verbose=0, tensorboard_log=settings['tensorboard_log'])
elif settings['policy'] == 'MlpLnLstmPolicy':
    model = PPO2(MlpLnLstmPolicy, env, verbose=0, tensorboard_log=settings['tensorboard_log'])

model.learn(total_timesteps=settings['total_timesteps'])

model_name = '_'.join(str(settings[k]) for k in
                      ('model_name', 'policy', 'total_timesteps', 'total_steps', 'window_length', 'allow_short'))
model.save(model_name)
Top GitHub Comments
Anyone?
You can throw a bigger network at your problem (by default it is 2 layers of 64); that will use more GPU power and might help your convergence, as sketched below.
from the documentation:
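For instance, a minimal sketch of passing a larger architecture through PPO2's policy_kwargs argument; the [128, 128] layer sizes are illustrative and not taken from the issue, and env and settings refer to the objects from the training script above:

from stable_baselines import PPO2
from stable_baselines.common.policies import MlpPolicy

# Two shared hidden layers of 128 units instead of the default two layers of 64.
# 'env' and 'settings' are the vectorized environment and config defined earlier.
model = PPO2(MlpPolicy, env, verbose=0,
             policy_kwargs=dict(net_arch=[128, 128]),
             tensorboard_log=settings['tensorboard_log'])
model.learn(total_timesteps=settings['total_timesteps'])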