Windows: rllib RNN assert seq_lens is not None
What is the problem?
Using the option `use_lstm=True` ends in an assertion error. It appears that the model is always called with sequence length `None`. I'm not sure if this is a bug, but according to the documentation, adding this option should just wrap the model with an LSTM cell. Oddly enough, it also happens with the CartPole environment. Is this intended behaviour?
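For context, the LSTM wrapper is controlled by a few model-config keys. A minimal sketch based on RLlib's documented `MODEL_DEFAULTS` (values illustrative; defaults differ between Ray versions):

```python
# RLlib model-config keys that control the automatic LSTM wrapping:
config["model"]["use_lstm"] = True       # wrap the default net with an LSTM cell
config["model"]["lstm_cell_size"] = 256  # hidden size of the wrapping LSTM
config["model"]["max_seq_len"] = 20      # max sequence length used when batching
```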
```
Traceback (most recent call last):
  File "D:/Seafile/Programming projects/rl_trading/test.py", line 44, in <module>
    trainer = sac.SACTrainer(config=config, env="CartPole-v0")
  File "C:\Python38\lib\site-packages\ray\rllib\agents\trainer_template.py", line 88, in __init__
    Trainer.__init__(self, config, env, logger_creator)
  File "C:\Python38\lib\site-packages\ray\rllib\agents\trainer.py", line 479, in __init__
    super().__init__(config, logger_creator)
  File "C:\Python38\lib\site-packages\ray\tune\trainable.py", line 245, in __init__
    self.setup(copy.deepcopy(self.config))
  File "C:\Python38\lib\site-packages\ray\rllib\agents\trainer.py", line 643, in setup
    self._init(self.config, self.env_creator)
  File "C:\Python38\lib\site-packages\ray\rllib\agents\trainer_template.py", line 101, in _init
    self.workers = self._make_workers(
  File "C:\Python38\lib\site-packages\ray\rllib\agents\trainer.py", line 708, in _make_workers
    return WorkerSet(
  File "C:\Python38\lib\site-packages\ray\rllib\evaluation\worker_set.py", line 66, in __init__
    self._local_worker = self._make_worker(
  File "C:\Python38\lib\site-packages\ray\rllib\evaluation\worker_set.py", line 259, in _make_worker
    worker = cls(
  File "C:\Python38\lib\site-packages\ray\rllib\evaluation\rollout_worker.py", line 403, in __init__
    self._build_policy_map(policy_dict, policy_config)
  File "C:\Python38\lib\site-packages\ray\rllib\evaluation\rollout_worker.py", line 986, in _build_policy_map
    policy_map[name] = cls(obs_space, act_space, merged_conf)
  File "C:\Python38\lib\site-packages\ray\rllib\policy\tf_policy_template.py", line 132, in __init__
    DynamicTFPolicy.__init__(
  File "C:\Python38\lib\site-packages\ray\rllib\policy\dynamic_tf_policy.py", line 236, in __init__
    action_distribution_fn(
  File "C:\Python38\lib\site-packages\ray\rllib\agents\sac\sac_tf_policy.py", line 108, in get_distribution_inputs_and_class
    model_out, state_out = model({
  File "C:\Python38\lib\site-packages\ray\rllib\models\modelv2.py", line 202, in __call__
    res = self.forward(restored, state or [], seq_lens)
  File "C:\Python38\lib\site-packages\ray\rllib\models\tf\recurrent_net.py", line 157, in forward
    assert seq_lens is not None
AssertionError

Process finished with exit code 1
```
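Reading the frames: SAC's `get_distribution_inputs_and_class` calls the model directly, `ModelV2.__call__` passes its `seq_lens` argument (here `None`) straight through to `forward()`, and the LSTM wrapper's `forward()` asserts. A simplified paraphrase of that chain, pieced together from the traceback above (not verbatim RLlib source):

```python
# Simplified paraphrase of the failing call chain; names and signatures
# are abbreviated from the traceback, not copied from RLlib's source.

def get_distribution_inputs_and_class(policy, model, obs_batch, **kwargs):
    # sac_tf_policy.py: the model is invoked directly, with no RNN state
    # and no sequence lengths ...
    model_out, state_out = model({"obs": obs_batch, "is_training": False})
    ...

class ModelV2:
    def __call__(self, input_dict, state=None, seq_lens=None):
        # modelv2.py: seq_lens (still None) is forwarded unchanged
        res = self.forward(input_dict, state or [], seq_lens)
        ...

class RecurrentNetwork(ModelV2):
    def forward(self, input_dict, state, seq_lens):
        # recurrent_net.py: the LSTM wrapper requires sequence lengths
        assert seq_lens is not None  # <- the AssertionError above
        ...
```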
Ray version and other system information (Python version, TensorFlow version, OS):
- Python 3.8.5
- TensorFlow 2.3
- Windows 10
Reproduction (REQUIRED)
Please provide a script that can be run to reproduce the issue. The script should have no external library dependencies (i.e., use fake or mock data / environments):
```python
import ray
from ray.rllib.agents import sac

config = sac.DEFAULT_CONFIG.copy()
config["num_gpus"] = 1
config["num_workers"] = 1
config["framework"] = "tf"
config["model"]["use_lstm"] = True

ray.init(include_dashboard=False)

trainer = sac.SACTrainer(config=config, env="CartPole-v0")
for i in range(10):
    result = trainer.train()
```
- [x] I have verified my script runs in a clean environment and reproduces the issue.
- [x] I have verified the issue also occurs with the latest wheels.
Hello! I have the same problem with PPOTrainer with `use_lstm=True`.
Hello, I have exactly the same thing at inference time with a PPO trainer. It works fine during training but fails at inference… Thank you @jobeid1 for your code snippet; I can't seem to make it work on my side. Is it mandatory to access the agent's policy before executing the step method? I usually call `agent.compute_single_action` directly. I have tried the Ray release candidate (ray==2.0.0rc0) but the problem still remains 😦 Does the Ray team have this on the radar? All the best and thank you for your contributions!
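For anyone hitting the inference-time failure: with `use_lstm=True`, the recurrent state has to be threaded through each step manually; `compute_single_action` does not manage it for you, which is why the plain call fails. A minimal sketch, assuming a trained `trainer` and a Gym-style `env` (the exact return signature of `compute_single_action` can vary between Ray versions):

```python
# Inference loop for a recurrent (use_lstm=True) policy: fetch the initial
# hidden state from the policy once, then thread it through every step.
obs = env.reset()
state = trainer.get_policy().get_initial_state()  # initial LSTM state
done = False
while not done:
    # When a state is passed in, compute_single_action returns a tuple
    # (action, new_rnn_state, extra_fetches) rather than just the action.
    action, state, _ = trainer.compute_single_action(obs, state=state)
    obs, reward, done, info = env.step(action)
```

So to the question above: yes, touching the policy once via `get_initial_state()` before stepping is the usual pattern for recurrent policies.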