[feature request] Add total episodes parameter to model learn method

Hi,

TL;DR: I would like the option to pass either a total_episodes or a total_timesteps parameter to the model.learn() method.

Now, for my reasoning. Currently, we can only define the total_timesteps when training an agent, as follows.

model = A2C('MlpLstmPolicy', env, verbose=1, policy_kwargs=policy_kwargs)
model.learn(total_timesteps=1000)

However, in some scenarios (e.g., stock trading), it is quite common to have a fixed number of timesteps per episode, given by the number of available time-series data points. It can also be valuable to pass over every data point the same number of times, where the number of passes equals the number of episodes.

Thus, to train for a given number of episodes, each with a fixed number of timesteps, I have to compute the total_timesteps value myself before passing it to model.learn(), as follows:

desired_total_episodes = 100
n_points = train_df.shape[0]  # get the number of data points
total_timesteps = desired_total_episodes * n_points
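The arithmetic above can be wrapped in a small helper. This is only a minimal sketch: it assumes every episode runs for exactly the same number of steps, and the name timesteps_for_episodes and the figure of 250 data points are made up for illustration, not part of any library.

```python
def timesteps_for_episodes(desired_total_episodes: int, steps_per_episode: int) -> int:
    """Return the timestep budget for a number of fixed-length episodes.

    Hypothetical helper: assumes every episode lasts exactly steps_per_episode steps.
    """
    if desired_total_episodes < 0:
        raise ValueError("desired_total_episodes must be non-negative")
    if steps_per_episode <= 0:
        raise ValueError("steps_per_episode must be positive")
    return desired_total_episodes * steps_per_episode


# Example: 100 passes over a series of 250 data points (250 is an invented value).
total_timesteps = timesteps_for_episodes(100, 250)
print(total_timesteps)  # 25000
```

The resulting value would then be passed as total_timesteps to model.learn().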

Even so, this answer on StackOverflow says that

Where the episode length is known, set it to the desired number of episode you would like to train. However, it might be less because the agent might not (probably wont) reach max steps every time.

I must admit I do not know how accurate this answer is, but it worries me that my model may not scan all the data equally.
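One way to reason about this concern is to count how many episodes fit into a fixed timestep budget. The sketch below is a toy calculation, not Stable Baselines code: the function completed_episodes and the episode lengths are invented for illustration, and it simply assumes that episodes repeat with the given lengths.

```python
from itertools import cycle


def completed_episodes(total_timesteps: int, episode_lengths: list) -> int:
    """Count episodes that finish within a timestep budget.

    Hypothetical helper: episode lengths are taken from episode_lengths in a cycle.
    """
    if any(length <= 0 for length in episode_lengths):
        raise ValueError("episode lengths must be positive")
    steps_used = 0
    episodes = 0
    for length in cycle(episode_lengths):
        if steps_used + length > total_timesteps:
            break  # the next episode would not finish within the budget
        steps_used += length
        episodes += 1
    return episodes


budget = 100 * 250  # 100 desired episodes of 250 steps each (invented numbers)
print(completed_episodes(budget, [250]))       # 100: every episode runs full length
print(completed_episodes(budget, [250, 200]))  # 111: every other episode ends early
```

In this toy setting, the budget matches the desired number of episodes only when every episode runs to its full length.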

Another option, as discussed in this issue from the previous Stable Baselines repo, is to use a callback function. Still, with the callback approach I would have to pass a total_timesteps value high enough to reach the desired number of episodes. Hence, the callback approach feels like a roundabout workaround.
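To make the callback idea concrete, here is a library-free sketch. Everything in it is a stand-in: StopAfterEpisodes and toy_learn are hypothetical names, the loop only imitates a training loop with fixed-length episodes, and the real callback interface in Stable Baselines may differ.

```python
class StopAfterEpisodes:
    """Hypothetical callback that asks the loop to stop after max_episodes episodes."""

    def __init__(self, max_episodes: int):
        self.max_episodes = max_episodes
        self.n_episodes = 0

    def on_step(self, done: bool) -> bool:
        # Count an episode each time the environment reports it has finished.
        if done:
            self.n_episodes += 1
        # Returning False tells the loop to stop training.
        return self.n_episodes < self.max_episodes


def toy_learn(total_timesteps: int, steps_per_episode: int, callback: StopAfterEpisodes) -> int:
    """Stand-in for a training loop; returns the number of timesteps actually run."""
    timesteps = 0
    while timesteps < total_timesteps:
        timesteps += 1
        done = timesteps % steps_per_episode == 0  # fixed-length episodes
        if not callback.on_step(done):
            break
    return timesteps


callback = StopAfterEpisodes(max_episodes=100)
# total_timesteps acts only as an upper bound and must be large enough.
steps_run = toy_learn(total_timesteps=10**6, steps_per_episode=250, callback=callback)
print(callback.n_episodes, steps_run)  # 100 25000
```

Note that total_timesteps still has to be supplied and overestimated, which is the awkward part of this workaround.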

In conclusion, I believe that an option to pass total_episodes would be a simple and effective way to broaden the range of use cases served by this project.

Thank you for your attention!

Issue Analytics

Created 3 years ago
Reactions: 5
Comments: 21 (13 by maintainers)

Top GitHub Comments

araffin commented, Aug 23, 2020

#115 is now merged with master 😉

xicocaio commented, Aug 22, 2020

I altered my code and now it encompasses the case of multiple envs using the approach discussed here. I also included a test case for this particular situation. Everything is good to go and I followed all instructions on contributing and the PR template.

Still, until #115 is merged, I will probably wait before opening the PR with the changes on my fork.
