[rllib] function unbatch in ray/rllib/utils/spaces/space_utils.py does not work as intended
See original GitHub issue.
What is the problem?
The function unbatch in ray/rllib/utils/spaces/space_utils.py raises an error when given the example input from its own docstring.
Ray version: On master (latest commit hash is 66d204e0785619a6c9cef707b796e5804401b6ca)
Python version: 3.8.2
Reproduction (REQUIRED)
from ray.rllib.utils.spaces.space_utils import unbatch
unbatch({"a": [1, 2, 3], "b": ([4, 5, 6], [7.0, 8.0, 9.0])})
Traceback is:
File "~/ray/rllib/utils/spaces/space_utils.py", line 124, in unbatch
for batch_pos in range(len(flat_batches[0])):
TypeError: object of type 'int' has no len()
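For context, the failure happens in how the batch is flattened before slicing. Here is a minimal sketch of the likely cause, assuming unbatch flattens its input with dm-tree's tree.flatten (as the flat_batches name in the traceback suggests): plain Python lists count as nested structure rather than as batched leaves, so the first flattened element is the int 1, and calling len() on it fails.
import tree  # dm-tree, which RLlib uses for nested-structure handling
# Plain lists are treated as *structure*, so their elements become the leaves:
tree.flatten({"a": [1, 2, 3], "b": ([4, 5, 6], [7.0, 8.0, 9.0])})
# -> [1, 2, 3, 4, 5, 6, 7.0, 8.0, 9.0]
# unbatch then evaluates len(flat_batches[0]), i.e. len(1) -> TypeError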
- I have verified my script runs in a clean environment and reproduces the issue.
- I have verified the issue also occurs with the latest wheels.
Issue Analytics
- State:
- Created 3 years ago
- Reactions: 3
- Comments: 10 (4 by maintainers)
The world you're describing, where all policies return dim-0 batched tensors, seems to be outside our known universe. Have you checked all policies, examples, and documentation? I guarantee that simple Python lists are fairly common, especially in the custom-policies section, but also in bona fide RLlib implementations.
If this is not a bug, then all RLlib policies should be converted so that they actually return tensors rather than plain lists, and all examples and documentation need to be adjusted as well. Quite a bit of work, potentially…
This is actually not a bug; only the example in the function's docstring should be changed to use np.arrays instead of lists. Normally, if you do a (batched) forward pass through some model and compute your actions from that, your results will be dim-0 batched torch/tf/numpy tensors, so this is fine. This PR enhances the docstring and gives a better example for using this function: https://github.com/ray-project/ray/pull/18967
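For completeness, a minimal sketch of the intended usage described above (illustrative, not taken verbatim from the PR): with dim-0 batched numpy arrays as leaves, unbatch splits the structure along the batch axis as the docstring intends.
import numpy as np
from ray.rllib.utils.spaces.space_utils import unbatch

# Same nested structure as the repro, but with numpy arrays as leaves:
batch = {"a": np.array([1, 2, 3]),
         "b": (np.array([4, 5, 6]), np.array([7.0, 8.0, 9.0]))}
unbatch(batch)
# -> [{"a": 1, "b": (4, 7.0)},
#     {"a": 2, "b": (5, 8.0)},
#     {"a": 3, "b": (6, 9.0)}]   (elements may print as numpy scalars)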