When using HER + SAC, every call to learn massively decreases performance
I didn’t test other algorithms, so I’m not sure whether this is a problem with HER, with SAC, with the combination of the two, or with all available algorithms.
I first noticed this on my custom environment, so in order to make sure it wasn’t a problem on my end, I also tested it using BitFlippingEnv.
Take the example below:
    from stable_baselines import HER, SAC
    from stable_baselines.common.bit_flipping_env import BitFlippingEnv

    env = BitFlippingEnv(continuous=True)

    model = HER(
        policy='MlpPolicy',
        env=env,
        model_class=SAC,
        n_sampled_goal=4,
        goal_selection_strategy='future',
        verbose=1
    )

    # Train in repeated short chunks; each call is slower than the previous one.
    while True:
        model.learn(2000, log_interval=1)
Every subsequent call to learn massively reduces training speed. On my computer, the 1st call runs at approximately 315 fps, the 2nd at 275, the 3rd at 150 and the 4th at 50.
Any way to fix this?
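To make the slowdown easier to reproduce, here is a minimal timing sketch. It is not part of stable_baselines: `measure_fps` and `fake_learn` are hypothetical names used for illustration, and `fake_learn` is only a stand-in so the snippet runs without the library installed.

```python
import time
from typing import Callable, List


def measure_fps(learn_fn: Callable[[int], object],
                n_calls: int,
                steps_per_call: int) -> List[float]:
    """Call learn_fn(steps_per_call) n_calls times; return steps per second for each call."""
    fps_per_call = []
    for _ in range(n_calls):
        start = time.perf_counter()
        learn_fn(steps_per_call)
        # Guard against a zero reading on very coarse timers.
        elapsed = max(time.perf_counter() - start, 1e-9)
        fps_per_call.append(steps_per_call / elapsed)
    return fps_per_call


def fake_learn(total_timesteps: int) -> None:
    """Hypothetical stand-in for model.learn; does a fixed amount of work per step."""
    sum(i * i for i in range(total_timesteps))


if __name__ == "__main__":
    results = measure_fps(fake_learn, n_calls=4, steps_per_call=2000)
    for call_index, fps in enumerate(results, start=1):
        print(f"call {call_index}: {fps:.0f} steps/s")
```

With the model from the example above, the stand-in could be swapped for `lambda n: model.learn(n, log_interval=1)`; a steady drop in the returned values across calls would match what I am seeing.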
Top GitHub Comments
I profiled the call to SAC’s learn method using the lib you linked. The first experiment consists of training for 2000 timesteps 3 times (using the code you posted). The second experiment consists of training for 6000 timesteps 1 time.
By comparing the two, the largest difference seems to be in the call

    self.replay_buffer.add(obs, action, reward, new_obs, float(done))

which takes 10 times longer in the 1st experiment than in the 2nd.

[Attachments not preserved: profiler output and logs for the x3 2000ts and x1 6000ts runs]
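For anyone who wants to repeat this comparison, a rough sketch using the standard-library `cProfile` is below. This is an assumption on my part and not necessarily the profiler used above; `profile_calls`, `fake_learn` and `add_to_buffer` are hypothetical stand-ins so the snippet runs without stable_baselines.

```python
import cProfile
import io
import pstats
from typing import Callable


def profile_calls(fn: Callable[[int], object],
                  n_calls: int,
                  steps_per_call: int,
                  top: int = 10) -> str:
    """Profile n_calls invocations of fn(steps_per_call); return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(n_calls):
        fn(steps_per_call)
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


def add_to_buffer(storage: list, item: int) -> None:
    """Hypothetical stand-in for replay_buffer.add."""
    storage.append(item)


def fake_learn(total_timesteps: int) -> None:
    """Hypothetical stand-in for model.learn: one buffer insertion per step."""
    storage = []
    for step in range(total_timesteps):
        add_to_buffer(storage, step)


if __name__ == "__main__":
    print("x3 2000ts:")
    print(profile_calls(fake_learn, n_calls=3, steps_per_call=2000))
    print("x1 6000ts:")
    print(profile_calls(fake_learn, n_calls=1, steps_per_call=6000))
```

Passing `lambda n: model.learn(n, log_interval=1)` in place of `fake_learn` and comparing the cumulative time reported for the replay buffer's `add` in the two reports should show whether the same gap appears.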
Apparently, the problem is solved in v3: https://github.com/hill-a/stable-baselines/issues/845#issuecomment-639754121 because of the new replay buffer implementation.