[rllib] function unbatch in ray/rllib/utils/spaces/space_utils.py does not work as intended
See original GitHub issue.
What is the problem?
The function unbatch in ray/rllib/utils/spaces/space_utils.py raises an error when given the example input from its own docstring.
Ray version: On master (latest commit hash is 66d204e0785619a6c9cef707b796e5804401b6ca)
Python version: 3.8.2
Reproduction (REQUIRED)
from ray.rllib.utils.spaces.space_utils import unbatch
unbatch({"a": [1, 2, 3], "b": ([4, 5, 6], [7.0, 8.0, 9.0])})
Traceback is:
File "~/ray/rllib/utils/spaces/space_utils.py", line 124, in unbatch
for batch_pos in range(len(flat_batches[0])):
TypeError: object of type 'int' has no len()
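For context, the failure happens in how the batch is flattened before slicing. Here is a minimal sketch of the likely cause, assuming unbatch flattens its input with dm-tree's tree.flatten (as the flat_batches name in the traceback suggests): plain Python lists count as nested structure rather than as batched leaves, so the first flattened element is the int 1, and calling len() on it fails.
import tree  # dm-tree, which RLlib uses for nested-structure handling
# Plain lists are treated as *structure*, so their elements become the leaves:
tree.flatten({"a": [1, 2, 3], "b": ([4, 5, 6], [7.0, 8.0, 9.0])})
# -> [1, 2, 3, 4, 5, 6, 7.0, 8.0, 9.0]
# unbatch then evaluates len(flat_batches[0]), i.e. len(1) -> TypeError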
- I have verified my script runs in a clean environment and reproduces the issue.
- I have verified the issue also occurs with the latest wheels.
Issue Analytics
- State:
- Created 3 years ago
- Reactions: 3
- Comments: 10 (4 by maintainers)
The world you're describing, where all policies return dim-0 batched tensors, seems to be outside our known universe. Have you checked all policies, examples, and documentation? I guarantee that simple Python lists are fairly common, especially in the custom-policies section, but also in bona fide RLlib implementations.
If this is not a bug, then all RLlib policies should be converted so that they actually return tensors rather than plain lists, and all examples and documentation need to be adjusted as well. Quite a bit of work, potentially…
This is actually not a bug; only the example in the function's docstring should be changed to use np.arrays instead of lists. Normally, if you do a (batched) forward pass through some model and compute your actions from that, your results will be dim-0 batched torch/tf/numpy tensors, so this is fine. This PR enhances the docstring and gives a better example for using this function: https://github.com/ray-project/ray/pull/18967
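For completeness, a minimal sketch of the intended usage described above (illustrative, not taken verbatim from the PR): with dim-0 batched numpy arrays as leaves, unbatch splits the structure along the batch axis as the docstring intends.
import numpy as np
from ray.rllib.utils.spaces.space_utils import unbatch

# Same nested structure as the repro, but with numpy arrays as leaves:
batch = {"a": np.array([1, 2, 3]),
         "b": (np.array([4, 5, 6]), np.array([7.0, 8.0, 9.0]))}
unbatch(batch)
# -> [{"a": 1, "b": (4, 7.0)},
#     {"a": 2, "b": (5, 8.0)},
#     {"a": 3, "b": (6, 9.0)}]   (elements may print as numpy scalars)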