How to compute custom metrics during training?
Hello,
In many of your papers you compute various metrics during training, e.g. equality, sustainability, etc. I am trying to compute these metrics for one of your substrates using RLlib. According to this official tutorial, you can use custom callbacks to achieve that.
Theoretically, the on_episode_step callback has a base_env parameter that stores the environment information, and you can access the environment data with obs, rewards, dones, infos, off_policy_actions = base_env.poll().
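For concreteness, here is a minimal sketch of the kind of callbacks class I have in mind. This is only my own sketch: it assumes RLlib 1.x, where DefaultCallbacks lives in ray.rllib.agents.callbacks (newer releases moved it to ray.rllib.algorithms.callbacks), and the "player_0" and "equality" names are just placeholders for whatever the wrapper actually exposes.

```python
import numpy as np

# In newer RLlib versions: from ray.rllib.algorithms.callbacks import DefaultCallbacks
from ray.rllib.agents.callbacks import DefaultCallbacks


class SocialOutcomeCallbacks(DefaultCallbacks):
    """Sketch: accumulate per-agent data and report custom metrics."""

    def on_episode_step(self, *, worker, base_env, episode, env_index=None, **kwargs):
        # Info dict emitted by the env for this agent on the latest step;
        # this avoids calling base_env.poll() directly.
        info = episode.last_info_for("player_0")
        if info:
            episode.user_data.setdefault("infos", []).append(info)

    def on_episode_end(self, *, worker, base_env, policies, episode, env_index=None, **kwargs):
        # episode.agent_rewards maps (agent_id, policy_id) -> accumulated return.
        returns = np.array(list(episode.agent_rewards.values()), dtype=np.float64)
        # Equality = 1 - Gini coefficient of per-agent returns
        # (assumes non-negative returns).
        pairwise_diffs = np.abs(returns[:, None] - returns[None, :]).sum()
        denom = 2.0 * len(returns) * returns.sum()
        episode.custom_metrics["equality"] = (
            1.0 - pairwise_diffs / denom if denom > 0 else 1.0)
```

The class would then be passed to the trainer config as config={"callbacks": SocialOutcomeCallbacks}, and anything written to episode.custom_metrics shows up under custom_metrics in the training results.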
In the case of allelopathic_harvest, some useful information is stored in the WORLD observations, for example WORLD.WHO_ZAPPED_WHO. However, since the WORLD observations are deleted in timestep_to_observations, I cannot access this data from the RLlib callback.
I tried to send data using the info variable in this script, as shown below.
```python
def step(self, action):
    """See base class."""
    actions = [action[agent_id] for agent_id in self._ordered_agent_ids]
    timestep = self._env.step(actions)
    rewards = {
        agent_id: timestep.reward[index]
        for index, agent_id in enumerate(self._ordered_agent_ids)
    }
    done = {'__all__': timestep.last()}
    # Attempt to pass custom data to the callbacks through the info dict.
    info = {"player_0": {"__common__": "test"}}
    observations = _timestep_to_observations(timestep)
    return observations, rewards, done, info
```
But on the callback side I just get the following output:

(RolloutWorker pid=21172) {0: {}}

Additionally, I noticed that the rewards obtained on the callback side via base_env.poll() are also empty.
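What I would like to end up with is a step() that forwards the WORLD observations through per-agent info dicts, so that a callback can read them with episode.last_info_for(agent_id). Below is a rough sketch of what I mean; it assumes timestep.observation is a per-player sequence of dicts, as in the Melting Pot wrappers, and the who_zapped_who key is just a name I chose.

```python
def step(self, action):
    """See base class."""
    actions = [action[agent_id] for agent_id in self._ordered_agent_ids]
    timestep = self._env.step(actions)
    rewards = {
        agent_id: timestep.reward[index]
        for index, agent_id in enumerate(self._ordered_agent_ids)
    }
    done = {'__all__': timestep.last()}
    # WORLD.* observations are global, so read them once (from player 0)
    # and forward them in every agent's info dict.
    who_zapped_who = timestep.observation[0].get('WORLD.WHO_ZAPPED_WHO')
    info = {
        agent_id: {'who_zapped_who': who_zapped_who}
        for agent_id in self._ordered_agent_ids
    }
    observations = _timestep_to_observations(timestep)
    return observations, rewards, done, info
```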
Do you know how we can compute these metrics? I am aware that you don't use RLlib internally at DeepMind; however, I consider this the best place to ask.
Thank you!
I think that the code I posted here does the job, so I will close the issue!
Great!!!