
Test phase on custom environment


Hi,

I’m experimenting with a custom env in rlpyt. I intend to use different data for training and testing (the env shows novel states during testing/evaluation versus training).

I have been using example_1 as a stepping stone, and so far so good, but I’m not sure how to achieve this last bit: testing.

After running runner.train() (as in example_1, inside the logger context), I think I should use runner.evaluate_agent() inside a loop to evaluate the agent several times (or maybe pass eval_max_steps=NumEvals as a SerialSampler argument?).

But I’m a bit (more) lost on how to “send” the test signal to my environment from here, since only the SerialSampler ‘knows’ where the environment class is, and I cannot find a way to use it to pass arguments to the env in this phase.

In short (I don’t know if I’m being clear), I need to know how:

  1. to test/evaluate my agent, and
  2. to send a “test” argument to my custom env during the testing phase.

Thank you!

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 5 (1 by maintainers)

Top GitHub Comments

1 reaction
astooke commented, May 21, 2020

Actually, the evaluation is performed using a separate instance of your environment, and it can be instantiated with different kwargs than the environment used for training. See the env_kwargs and eval_env_kwargs passed into the sampler. Sounds like exactly what you need. 😃

https://github.com/astooke/rlpyt/blob/85d4e018a919118c6e42fac3e897aa346d84b9c5/examples/example_1.py#L29

Thanks for moving your question over!
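To make that concrete, here is a minimal sketch loosely following example_1. CustomEnv, env_id, the mode kwarg, and the algo/agent variables are placeholders for your own setup, and the eval settings are just example values; env_kwargs, eval_env_kwargs, eval_n_envs, eval_max_steps, and eval_max_trajectories are standard SerialSampler arguments, and MinibatchRlEval (the runner used in example_1) runs the evaluation environments at every logging interval:

from rlpyt.samplers.serial.sampler import SerialSampler
from rlpyt.runners.minibatch_rl import MinibatchRlEval

sampler = SerialSampler(
    EnvCls=CustomEnv,                              # your environment class
    env_kwargs=dict(id=env_id, mode="train"),      # instances used for training
    eval_env_kwargs=dict(id=env_id, mode="test"),  # separate instances used only for evaluation
    batch_T=1,
    batch_B=1,
    eval_n_envs=5,               # number of evaluation env instances
    eval_max_steps=int(25e3),    # total env steps per evaluation phase
    eval_max_trajectories=50,    # stop early after this many eval episodes
)

runner = MinibatchRlEval(        # evaluates on the eval envs at each log interval
    algo=algo,
    agent=agent,
    sampler=sampler,
    n_steps=1e6,
    log_interval_steps=1e4,
)
runner.train()

With this setup the training environments never see the test configuration: evaluation episodes come only from the instances built with eval_env_kwargs, and the agent is switched to eval_mode() for the duration of each evaluation.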

0 reactions
LecJackS commented, Jun 7, 2020

Just for the record, I ended up solving the last issue like this:

Keep track of the episode number in each environment instance (passing the log interval as a parameter), define a summary writer, and log from the environment on each test call, as well as for the last 50 training runs before each call to test.

from torch.utils.tensorboard import SummaryWriter  # or tensorboardX, depending on your setup
from rlpyt.samplers.serial.sampler import SerialSampler

writer = SummaryWriter()
epi_len = 65               # number of steps per episode
log_int = 500 * epi_len    # log every 500 episodes, expressed in env steps

sampler = SerialSampler(
    EnvCls=CustomEnv,
    env_kwargs=dict(id=env_id, mode="train", writer=writer, logInt=log_int),
    eval_env_kwargs=dict(id=env_id, mode="test", writer=writer, logInt=log_int),
    # ... remaining sampler kwargs (batch_T, batch_B, eval_n_envs, eval_max_steps, ...)
)

And in the environment:

def plot_stats(self, reward):
    # Tensorboard debugging
    if self.mode == 'test':
        # Plot all rewards
        self.writer.add_scalars('data/' + str(self.mode), {'reward': reward},
                                self.episode)
        # Save last numEvals for statistics
        self.rewardHist[self.numLog % self.numEvals] = reward

        if ((self.episode + 1) % self.epiLogInt) == 0:
            # Plot statistics of last self.numEvals
            self.writer.add_scalars('data/' + str(self.mode),
                                    {'mean':   np.mean(self.rewardHist),
                                     'median': np.median(self.rewardHist),
                                     'max':    np.max(self.rewardHist),
                                     'min':    np.min(self.rewardHist)},
                                    self.episode)
            self.writer.add_scalars('data/' + str(self.mode) + "/",
                                    {'std': np.std(self.rewardHist)},
                                    self.episode)
            # Keep same count as training episodes
            self.episode += self.epiLogInt - 1  # self.numEvals
        self.numLog += 1
    else:
        # Want 5 training episodes before the 5 testing episodes
        if self.episode > (self.episode - self.numEvals - 1) \
                and ((self.episode - self.numEvals) % self.epiLogInt) >= 0 \
                and ((self.episode - self.numEvals) % self.epiLogInt) < self.numEvals:
            # Plot all rewards
            # self.writer.add_scalars('data/' + str(self.mode), {'reward': reward},
            #                         self.episode)
            # Save last numEvals for statistics
            self.rewardHist[self.numLog % self.numEvals] = reward
            # Plot statistics of last self.numEvals
            if (self.numLog % self.numEvals) == (self.numEvals - 1):
                self.writer.add_scalars('data/' + str(self.mode),
                                        {'mean':   np.mean(self.rewardHist),
                                         'median': np.median(self.rewardHist),
                                         'max':    np.max(self.rewardHist),
                                         'min':    np.min(self.rewardHist),
                                         'maxAllTime': self.maxRecord},
                                        self.episode)
                self.writer.add_scalars('data/' + str(self.mode) + "/",
                                        {'std': np.std(self.rewardHist)},
                                        self.episode)
            self.numLog += 1

Ugly as f*ck, but it works, so I’m closing this issue 😃
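For completeness, here is a hypothetical sketch of how the environment's constructor might wire up the kwargs the sampler passes in. The attribute names (mode, writer, epiLogInt, numEvals, rewardHist, episode, numLog, maxRecord) are taken from plot_stats above; the constructor signature, default values, and everything else are assumptions, and step()/reset()/space definitions are omitted:

import numpy as np

class CustomEnv:  # in rlpyt this would subclass rlpyt.envs.base.Env (or wrap a gym env)
    def __init__(self, id=None, mode="train", writer=None, logInt=1, numEvals=5):
        self.id = id
        self.mode = mode              # "train" or "test", from env_kwargs / eval_env_kwargs
        self.writer = writer          # shared SummaryWriter instance
        self.epiLogInt = logInt       # logging interval (the post passes it in steps; convert to episodes if needed)
        self.numEvals = numEvals      # window size for the reward statistics
        self.rewardHist = np.zeros(numEvals)
        self.episode = 0              # episode counter, incremented by the env on each reset
        self.numLog = 0               # how many rewards have been logged so far
        self.maxRecord = -np.inf      # best reward seen so far (used as 'maxAllTime')
        # ... load the train or test data for the given mode, define spaces, etc.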


