RLLib example not using conv as configured
Although convolutions are specified in the RLLib example, the models that are actually spawned are LSTM+FC only.
The configuration in https://github.com/deepmind/meltingpot/blob/main/examples/rllib/self_play_train.py:
config["model"]["conv_filters"] = [[16, [8, 8], 8], [128, [11, 11], 1]]
config["model"]["conv_activation"] = "relu"
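For reference, here is a rough sketch of the feature-map shapes this filter stack would produce if it were actually applied. It rests on two assumptions that are not stated in this issue: that the agent's RGB observation is 88x88x3, and that RLlib's vision network pads every layer except the last with "same" and the last with "valid". The helper name `conv_output_size` is hypothetical.

```python
import math


def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of one conv layer along a single dimension."""
    if padding == "same":
        return math.ceil(size / stride)
    if padding == "valid":
        return (size - kernel) // stride + 1
    raise ValueError(f"unknown padding: {padding}")


# Filter spec copied from the example config: [out_channels, [kh, kw], stride].
conv_filters = [[16, [8, 8], 8], [128, [11, 11], 1]]

# Assumption: 88x88 RGB observation (not confirmed in this issue).
height = width = 88

shapes = []
for i, (channels, (kh, kw), stride) in enumerate(conv_filters):
    # Assumption: "valid" padding on the last layer only, "same" elsewhere.
    padding = "valid" if i == len(conv_filters) - 1 else "same"
    height = conv_output_size(height, kh, stride, padding)
    width = conv_output_size(width, kw, stride, padding)
    shapes.append((height, width, channels))
    print(f"layer {i}: {height}x{width}x{channels}")
# layer 0: 11x11x16
# layer 1: 1x1x128
```

Under those assumptions the stack would end in a 128-dimensional feature vector, so a `Conv2D` layer should be visible in the model summary if the configuration were taking effect.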
The log of the architecture of one agent:
(RolloutWorker pid=287877) Model: "model_9"
(RolloutWorker pid=287877) __________________________________________________________________________________________________
(RolloutWorker pid=287877) Layer (type) Output Shape Param # Connected to
(RolloutWorker pid=287877) ==================================================================================================
(RolloutWorker pid=287877) seq_in (InputLayer) [(None,)] 0 []
(RolloutWorker pid=287877)
(RolloutWorker pid=287877) tf_op_layer_av_wk1/SequenceMas [()] 0 ['seq_in[0][0]']
(RolloutWorker pid=287877) k/Max (TensorFlowOpLayer)
(RolloutWorker pid=287877)
(RolloutWorker pid=287877) tf_op_layer_av_wk1/SequenceMas [()] 0 ['tf_op_layer_av_wk1/SequenceMask
(RolloutWorker pid=287877) k/Maximum (TensorFlowOpLayer) /Max[0][0]']
(RolloutWorker pid=287877)
(RolloutWorker pid=287877) tf_op_layer_av_wk1/SequenceMas [(None, 1)] 0 ['seq_in[0][0]']
(RolloutWorker pid=287877) k/ExpandDims (TensorFlowOpLaye
(RolloutWorker pid=287877) r)
(RolloutWorker pid=287877)
(RolloutWorker pid=287877) tf_op_layer_av_wk1/SequenceMas [(None,)] 0 ['tf_op_layer_av_wk1/SequenceMask
(RolloutWorker pid=287877) k/Range (TensorFlowOpLayer) /Maximum[0][0]']
(RolloutWorker pid=287877)
(RolloutWorker pid=287877) tf_op_layer_av_wk1/SequenceMas [(None, 1)] 0 ['tf_op_layer_av_wk1/SequenceMask
(RolloutWorker pid=287877) k/Cast (TensorFlowOpLayer) /ExpandDims[0][0]']
(RolloutWorker pid=287877)
(RolloutWorker pid=287877) inputs (InputLayer) [(None, None, 267)] 0 []
(RolloutWorker pid=287877)
(RolloutWorker pid=287877) h (InputLayer) [(None, 256)] 0 []
(RolloutWorker pid=287877)
(RolloutWorker pid=287877) c (InputLayer) [(None, 256)] 0 []
(RolloutWorker pid=287877)
(RolloutWorker pid=287877) tf_op_layer_av_wk1/SequenceMas [(None, None)] 0 ['tf_op_layer_av_wk1/SequenceMask
(RolloutWorker pid=287877) k/Less (TensorFlowOpLayer) /Range[0][0]',
(RolloutWorker pid=287877) 'tf_op_layer_av_wk1/SequenceMask
(RolloutWorker pid=287877) /Cast[0][0]']
(RolloutWorker pid=287877)
(RolloutWorker pid=287877) lstm (LSTM) [(None, None, 256), 536576 ['inputs[0][0]',
(RolloutWorker pid=287877) (None, 256), 'h[0][0]',
(RolloutWorker pid=287877) (None, 256)] 'c[0][0]',
(RolloutWorker pid=287877) 'tf_op_layer_av_wk1/SequenceMask
(RolloutWorker pid=287877) /Less[0][0]']
(RolloutWorker pid=287877)
(RolloutWorker pid=287877) logits (Dense) (None, None, 11) 2827 ['lstm[0][0]']
(RolloutWorker pid=287877)
(RolloutWorker pid=287877) values (Dense) (None, None, 1) 257 ['lstm[0][0]']
(RolloutWorker pid=287877)
(RolloutWorker pid=287877) ==================================================================================================
(RolloutWorker pid=287877) Total params: 539,660
(RolloutWorker pid=287877) Trainable params: 539,660
(RolloutWorker pid=287877) Non-trainable params: 0
(RolloutWorker pid=287877) __________________________________________________________________________________________________
(RolloutWorker pid=287877) ...
I think that this is caused by how the observation spaces are flattened in RLLib.
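The parameter counts in the summary above are consistent with that reading. A quick check, using the standard parameter formulas for an LSTM and a dense layer (the function names are just for illustration), shows that the LSTM consumes the flat 267-dimensional input directly, with nothing in between:

```python
def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights, and a bias."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    """Weight matrix plus bias vector."""
    return input_dim * units + units


# Dimensions read off the logged model summary.
lstm = lstm_params(input_dim=267, units=256)
logits = dense_params(input_dim=256, units=11)
values = dense_params(input_dim=256, units=1)
total = lstm + logits + values

print(lstm, logits, values, total)
# 536576 2827 257 539660
```

These match the `lstm`, `logits`, `values`, and `Total params` figures in the log exactly, so no convolutional (or other) parameters are present in the spawned model.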
Output of pip freeze:
absl-py==1.0.0
aiohttp==3.8.1
aiohttp-cors==0.7.0
aioredis==1.3.1
aiosignal==1.2.0
astunparse==1.6.3
async-timeout==4.0.2
attrs==21.4.0
blessed==1.19.1
cachetools==5.0.0
certifi==2021.10.8
charset-normalizer==2.0.12
chex==0.1.1
click==8.0.4
cloudpickle==2.0.0
colorful==0.5.4
contextlib2==21.6.0
cycler==0.11.0
Deprecated==1.2.13
dm-env==1.5
-e git+https://github.com/deepmind/meltingpot@79f8756389d590c2b965de37c9b54cdf8679f7a7#egg=dm_meltingpot
dm-tree==0.1.6
dmlab2d @ https://github.com/deepmind/lab2d/releases/download/release_candidate_2021-07-13/dmlab2d-1.0-cp39-cp39-manylinux_2_31_x86_64.whl
filelock==3.6.0
flatbuffers==2.0
fonttools==4.30.0
frozenlist==1.3.0
gast==0.5.3
google-api-core==2.7.1
google-auth==2.6.0
google-auth-oauthlib==0.4.6
google-pasta==0.2.0
googleapis-common-protos==1.55.0
gpustat==1.0.0b1
grpcio==1.43.0
gym==0.21.0
h5py==3.6.0
hiredis==2.0.0
idna==3.3
imageio==2.16.1
immutabledict==2.2.1
importlib-metadata==4.11.3
jax==0.3.1
jaxlib==0.3.0
jsonschema==4.4.0
keras==2.8.0
Keras-Preprocessing==1.1.2
kiwisolver==1.4.0
libclang==13.0.0
lz4==4.0.0
Markdown==3.3.6
matplotlib==3.5.1
ml-collections==0.1.1
msgpack==1.0.3
multidict==6.0.2
networkx==2.7.1
nose==1.3.7
numpy==1.22.3
nvidia-ml-py3==7.352.0
oauthlib==3.2.0
opencensus==0.8.0
opencensus-context==0.1.2
opt-einsum==3.3.0
packaging==21.3
pandas==1.4.1
Pillow==9.0.1
prometheus-client==0.13.1
protobuf==3.19.4
psutil==5.9.0
py-spy==0.3.11
pyasn1==0.4.8
pyasn1-modules==0.2.8
pygame==2.1.2
pyparsing==3.0.7
pyrsistent==0.18.1
python-dateutil==2.8.2
pytz==2021.3
PyWavelets==1.3.0
PyYAML==6.0
ray==1.11.0
redis==4.1.4
requests==2.27.1
requests-oauthlib==1.3.1
rsa==4.8
Rx==3.2.0
scikit-image==0.19.2
scipy==1.8.0
six==1.16.0
smart-open==5.2.1
tabulate==0.8.9
tensorboard==2.8.0
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
tensorboardX==2.5
tensorflow==2.8.0
tensorflow-io-gcs-filesystem==0.24.0
termcolor==1.1.0
tf-estimator-nightly==2.8.0.dev2021122109
tifffile==2022.2.9
toolz==0.11.2
typing_extensions==4.1.1
urllib3==1.26.8
wcwidth==0.2.5
Werkzeug==2.0.3
wrapt==1.14.0
yarl==1.7.2
zipp==3.7.0
Good catch! Do you know how to fix it? Could you show us how? None of us are experts on RLLib…
Yes