[Question] Using SubprocVecEnv and Monitor
Describe the bug
I’m attempting to combine the Monitor wrapper with SubprocVecEnv. I believe I have done this mostly successfully, and my approach is similar to the one in the zoo utils. However, the monitor.csv file comes out garbled, and I suspect this is an issue with parallel I/O.
Code example
The following code runs without error:
import os
import gym
import numpy as np
from stable_baselines.bench import Monitor
from stable_baselines.common.policies import MlpPolicy
from stable_baselines import PPO2

if __name__ == '__main__':
    # Create log dir
    log_dir = "tmp/"
    os.makedirs(log_dir, exist_ok=True)
    cpu = 4

    # --- Setup the Environment --- #
    if cpu == 1:
        from stable_baselines.common.vec_env import DummyVecEnv
        env = DummyVecEnv([lambda: Monitor(gym.make('CartPole-v0'), log_dir, allow_early_resets=True)])
    else:
        from stable_baselines.common.vec_env import SubprocVecEnv
        env = SubprocVecEnv([lambda: Monitor(gym.make('CartPole-v0'), log_dir, allow_early_resets=True) for _ in range(cpu)])

    model = PPO2(MlpPolicy, env, verbose=1)
    # Train the agent
    model.learn(total_timesteps=int(1.0e4))
However, when I examine the produced monitor.csv file, it looks like:
#{"t_start": 1572626492.855354, "env_id": "CartPole-v0"}
r,l,t
39.0,39,0.657071
15.0,15,0.66697
22.0,22,0.680887
941.0,41,0.740440.0,40,0.710295
15.0,15,0.919597
14.0,14,0.928906
35.0,35,0.950049
27.037.0,37,1.005430.0,30,0.983271
34.0,34,1.030959
16.0,16,1.040761
12.0,12,1.048242
12.0,12,1.055426
18.0,18,1.065959
17.0,17,1.07566
38.0,38,1.13172
43.0,43,1.167263
13.0,13,1.176489
13.0,13,1.185495
69.0,69,1.253904
44.0,44,1.28036
15.0,15,1.288848
14.0,14,1.297092
25.0,25,1.311564
41.0,41,1.360075
23.0,23,1.373677
36.0,36,1.395002
32.0,32,1.419716
36.0,36,1.467655
30.0,30,1.484968
17.107.0,107,1.599988
92.0,92,1.653923
41.0,41,1.704438
12.0,12,1.712049
25.0,25,1.729134
21.0,21,1.743471
38.0,38,1.769697
22.0,22,1.804653
49.0,49,1.119.0,1114.0,14,1.8423.0,230.0,30,1.8650.0,50,1.869405
22.0,22,1.886335
16.0,149.0,49,1.9547.0,47,1.930151
41.0,41,1.978087
26.0,26,1.994618
27.032.0,45.0,45,41.0,41,2.74.0,74,2.12287
27.0,27,2.140701
39.0,39,250.0,87.0,87,2.18755.0,50.0,50,2.239721
17.056.0,56,2.250635.067.0,67,2.29009
48.0,48,2.3439056.21.0,21,2.75.0,75,70.0,70,2.47.0,47,55.0,55,2.97.0,97,80.0,80,2.72.0,72,58.0,58,2.55.0,55,75.0,75,2.75.0,75,82.0,82,2.46.0,46,44.0,44,250.0,50,89.0,89,2.797716
2.752339
37.0,37,2.773496
58.0,58,2.479663
54.0,54,2.516737
36.0,36,2.537496
18.0,18,2.5478
44.0,44,2.601305
30.0,30,2.619091
43.0,43,2.644261
89.0,89,2.726325
55.0,55,2.757966
41.0,41,2.806968
18.0,18,2.818015
66.0,66,2.857087
For comparison, here is the output from a single CPU and a DummyVecEnv. Every row is well formed, so something is going wrong in the SubprocVecEnv case.
#{"t_start": 1572627226.829188, "env_id": "CartPole-v0"}
r,l,t
20.0,20,0.681798
13.0,13,0.6879
32.0,32,0.700244
15.0,15,0.70535
45.0,45,0.72
19.0,19,0.943259
27.0,27,0.952662
21.0,21,0.95922
14.0,14,0.963787
22.0,22,0.971064
13.0,13,0.975076
23.0,23,1.007945
10.0,10,1.011909
29.0,29,1.021509
16.0,16,1.026368
17.0,17,1.032664
30.0,30,1.044385
22.0,22,1.073831
11.0,11,1.07787
13.0,13,1.082761
11.0,11,1.086141
16.0,16,1.091112
29.0,29,1.101255
38.0,38,1.116818
28.0,28,1.147547
18.0,18,1.153371
28.0,28,1.161589
27.0,27,1.17374
12.0,12,1.179716
18.0,18,1.186577
58.0,58,1.23631
23.0,23,1.246039
14.0,14,1.251698
47.0,47,1.290305
11.0,11,1.294172
28.0,28,1.302704
25.0,25,1.310377
16.0,16,1.315227
17.0,17,1.320602
21.0,21,1.35267
25.0,25,1.363953
16.0,16,1.369985
25.0,25,1.378724
40.0,40,1.392686
18.0,18,1.398987
28.0,28,1.427239
36.0,36,1.440283
29.0,29,1.451956
13.0,13,1.456849
52.0,52,1.493982
27.0,27,1.503049
16.0,16,1.508346
17.0,17,1.513618
63.0,63,1.553841
19.0,19,1.559812
48.0,48,1.580166
32.0,32,1.593218
21.0,21,1.620087
23.0,23,1.627711
35.0,35,1.639623
24.0,24,1.649704
55.0,55,1.69098
12.0,12,1.69574
32.0,32,1.707801
21.0,21,1.715795
33.0,33,1.727318
35.0,35,1.758523
89.0,89,1.789826
38.0,38,1.821974
75.0,75,1.849381
25.0,25,1.858357
29.0,29,1.888153
60.0,60,1.908269
17.0,17,1.915471
52.0,52,1.954277
14.0,14,1.958599
85.0,85,1.992599
90.0,90,2.040917
88.0,88,2.0913
120.0,120,2.15459
29.0,29,2.165061
44.0,44,2.183414
29.0,29,2.213457
108.0,108,2.249727
94.0,94,2.302364
157.0,157,2.37942
124.0,124,2.439284
144.0,144,2.53422
128.0,128,2.598712
200.0,200,2.687491
200.0,200,2.802759
134.0,134,2.87133
177.0,177,2.954203
31.0,31,2.964298
92.0,92,3.019974
71.0,71,3.064816
102.0,102,3.099747
136.0,136,3.163223
110.0,110,3.222245
128.0,128,3.290062
131.0,131,3.356284
98.0,98,3.409438
127.0,127,3.475
35.0,35,3.487581
116.0,116,3.549428
119.0,119,3.610692
141.0,141,3.677513
65.0,65,3.719216
200.0,200,3.810465
200.0,200,3.917623
85.0,85,3.966212
80.0,80,3.992697
65.0,65,4.037708
100.0,100,4.094709
103.0,103,4.141736
109.0,109,4.222339
200.0,200,4.354817
75.0,75,4.385399
80.0,80,4.439903
74.0,74,4.491665
93.0,93,4.53067
114.0,114,4.598915
104.0,104,4.660453
70.0,70,4.688232
200.0,200,4.812355
72.0,72,4.862492
108.0,108,4.90083
49.0,49,4.936031
90.0,90,4.9894
139.0,139,5.059209
200.0,200,5.147932
166.0,166,5.227801
57.0,57,5.26566
187.0,187,5.3562
136.0,136,5.426181
200.0,200,5.531484
200.0,200,5.641281
110.0,110,5.680168
135.0,135,5.767416
177.0,177,5.849996
84.0,84,5.899176
45.0,45,5.913942
200.0,200,6.004643
So I’m wondering whether this is a known issue and, if so, whether there is any way to guard against it and produce correct monitor.csv files with SubprocVecEnv.
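One likely explanation, going only by the code above: all four Monitor wrappers receive the same log_dir, so each worker process opens and appends to the same tmp/monitor.csv, and their writes interleave. A minimal sketch of a workaround is to give every worker its own file. The helper names monitor_path, make_env and make_vec_env are hypothetical, not part of stable_baselines. The sketch also assumes that Monitor appends ".monitor.csv" to a filename that is not a directory, which is worth checking against your installed version.

```python
import os


def monitor_path(log_dir, rank):
    """Return a per-worker filename prefix, e.g. 'tmp/0' for rank 0 (hypothetical helper)."""
    return os.path.join(log_dir, str(rank))


def make_env(env_id, rank, log_dir):
    """Return a zero-argument factory for worker `rank` (hypothetical helper)."""
    def _init():
        # Imported inside the factory so the environment is built in the worker process.
        import gym
        from stable_baselines.bench import Monitor

        env = gym.make(env_id)
        # Distinct filename per rank: no two processes write to the same file.
        # Assumption: Monitor turns 'tmp/0' into 'tmp/0.monitor.csv'.
        return Monitor(env, monitor_path(log_dir, rank), allow_early_resets=True)
    return _init


def make_vec_env(env_id, n_envs, log_dir):
    """Build a SubprocVecEnv whose workers each log to their own monitor file."""
    from stable_baselines.common.vec_env import SubprocVecEnv

    os.makedirs(log_dir, exist_ok=True)
    return SubprocVecEnv([make_env(env_id, rank, log_dir) for rank in range(n_envs)])
```

Calling make_vec_env('CartPole-v0', 4, 'tmp/') under the same `if __name__ == '__main__':` guard would replace the SubprocVecEnv line in the script above. The per-worker files can then be combined afterwards, for example by concatenating them and sorting on the t column.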
System Info
Describe the characteristics of your environment:
- OS: mac
- Describe how the library was installed: pip3
- Python version: Python 3.7.4
- Tensorflow version: 1.14.0
- Gym version: 0.15.3
- StableBaselines version: 2.8.0
Issue Analytics
- Created 4 years ago
- Comments:9
Top GitHub Comments
Yes, it looks quite nice. To answer your question, there are two quick ways:
Hello, could you please share how you fixed this problem? I’m currently trying the make_env function but I’m getting an error… I must be doing something wrong. This is what I have:
env = SubprocVecEnv([lambda: Monitor(make_env('onedof_fdmodel-v0', rank=i, log_dir=log_dir, env_kwargs=kwargs), filename=log_dir, allow_early_resets=False) for i in range(n_envs)])
Thanks a lot!!!
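The error in the comment above is not shown, so its cause is unknown, but one thing that may be worth checking is how the lambda captures i. A lambda created inside a comprehension looks up i when it is called, not when it is defined, so every worker could end up with the last rank. The snippet also passes the same log_dir as filename to every Monitor, which would reproduce the shared-file problem from the original post. The following stdlib-only sketch shows the late-binding effect and a factory-function fix. The name make_rank_fn is hypothetical.

```python
n_envs = 4

# Late binding: each lambda reads `i` only when called, after the loop has finished.
late = [lambda: i for i in range(n_envs)]
late_ranks = [f() for f in late]


def make_rank_fn(rank):
    # `rank` is a function parameter, so each returned lambda keeps its own value.
    return lambda: rank


bound = [make_rank_fn(i) for i in range(n_envs)]
bound_ranks = [f() for f in bound]

print(late_ranks)   # [3, 3, 3, 3]
print(bound_ranks)  # [0, 1, 2, 3]
```

Building the environment factories with a helper function, rather than a bare lambda, avoids this.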