Eager execution fails with RNN
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04
- Ray installed from (source or binary): Binary
- Ray version: 0.7.6
- Python version: 3.7.5
- Exact command to reproduce: run rllib/examples/custom_keras_rnn_model.py after adding eager: True to the tune config.
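A minimal sketch of the change described above, assuming the example script passes a plain config dict to tune.run. The keys in base_config are illustrative placeholders rather than the script's exact values; only "eager": True comes from this report. with_eager and run_repro are hypothetical helper names.

```python
import copy

# Illustrative stand-in for the example script's Tune config.
# These keys and values are assumptions, not copied from the script.
base_config = {
    "env": "CartPole-v0",
    "num_workers": 0,
    "model": {"custom_model": "rnn", "max_seq_len": 20},
}


def with_eager(config):
    """Return a copy of config with RLlib's eager flag switched on."""
    new_config = copy.deepcopy(config)
    new_config["eager"] = True  # the one-line change from this report
    return new_config


def run_repro():
    """Launch one PPO trial; needs Ray 0.7.6 with RLlib installed.

    Assumes a custom model was already registered under the name "rnn",
    as the example script does before calling tune.run.
    """
    import ray
    from ray import tune

    ray.init()
    tune.run(
        "PPO",
        stop={"training_iteration": 1},
        config=with_eager(base_config),
    )
```

Calling run_repro() is only expected to reach the failure reported here when the RNN model is in use; the helper is not invoked automatically.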
Describe the problem
I wanted to test TF eager execution with rllib/examples/custom_keras_rnn_model.py, but it failed. The assertion in make_tf_callable() fails because tf.executing_eagerly() returns False even in eager mode. After some debugging, I found that tf.executing_eagerly() starts returning the wrong value once rllib/models/catalog.py:258, which accesses tune.registry._global_registry, has executed. However, this does not happen without an RNN, for example when running rllib/examples/custom_keras_model.py.
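One way to narrow down where the eager flag flips is to record it around each suspect call. The sketch below is a generic helper, not part of RLlib; probe_eager and its is_eager parameter are hypothetical names. By default it queries tf.executing_eagerly(), and the parameter exists so the helper can be exercised without TensorFlow installed.

```python
def probe_eager(label, fn, *args, is_eager=None, **kwargs):
    """Call fn and report the eager flag before and after the call.

    Returns a tuple (result, before, after).
    """
    if is_eager is None:
        # Imported lazily so the helper can be tested without TensorFlow.
        import tensorflow as tf
        is_eager = tf.executing_eagerly

    before = is_eager()
    result = fn(*args, **kwargs)
    after = is_eager()

    if before != after:
        print("{}: eager flag changed {} -> {}".format(label, before, after))
    return result, before, after
```

Wrapping the call at rllib/models/catalog.py:258 in this way would show whether the flag differs before and after that line.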
Source code / logs
2019-12-09 10:17:13,116 ERROR trial_runner.py:569 -- Error processing event.
Traceback (most recent call last):
File "/home/neigh/miniconda3/envs/ray/lib/python3.7/site-packages/ray/tune/trial_runner.py", line 515, in _process_trial
result = self.trial_executor.fetch_result(trial)
File "/home/neigh/miniconda3/envs/ray/lib/python3.7/site-packages/ray/tune/ray_trial_executor.py", line 351, in fetch_result
result = ray.get(trial_future[0])
File "/home/neigh/miniconda3/envs/ray/lib/python3.7/site-packages/ray/worker.py", line 2121, in get
raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(AssertionError): ray_PPO:train() (pid=16838, host=daewoo-linux)
File "/home/neigh/miniconda3/envs/ray/lib/python3.7/site-packages/ray/rllib/agents/trainer_template.py", line 90, in __init__
Trainer.__init__(self, config, env, logger_creator)
File "/home/neigh/miniconda3/envs/ray/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 372, in __init__
Trainable.__init__(self, config, logger_creator)
File "/home/neigh/miniconda3/envs/ray/lib/python3.7/site-packages/ray/tune/trainable.py", line 96, in __init__
self._setup(copy.deepcopy(self.config))
File "/home/neigh/miniconda3/envs/ray/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 492, in _setup
self._init(self.config, self.env_creator)
File "/home/neigh/miniconda3/envs/ray/lib/python3.7/site-packages/ray/rllib/agents/trainer_template.py", line 109, in _init
self.config["num_workers"])
File "/home/neigh/miniconda3/envs/ray/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 537, in _make_workers
logdir=self.logdir)
File "/home/neigh/miniconda3/envs/ray/lib/python3.7/site-packages/ray/rllib/evaluation/worker_set.py", line 64, in __init__
RolloutWorker, env_creator, policy, 0, self._local_config)
File "/home/neigh/miniconda3/envs/ray/lib/python3.7/site-packages/ray/rllib/evaluation/worker_set.py", line 220, in _make_worker
_fake_sampler=config.get("_fake_sampler", False))
File "/home/neigh/miniconda3/envs/ray/lib/python3.7/site-packages/ray/rllib/evaluation/rollout_worker.py", line 351, in __init__
policy_dict, policy_config)
File "/home/neigh/miniconda3/envs/ray/lib/python3.7/site-packages/ray/rllib/evaluation/rollout_worker.py", line 764, in _build_policy_map
policy_map[name] = cls(obs_space, act_space, merged_conf)
File "/home/neigh/miniconda3/envs/ray/lib/python3.7/site-packages/ray/rllib/policy/eager_tf_policy.py", line 244, in __init__
before_loss_init(self, observation_space, action_space, config)
File "/home/neigh/miniconda3/envs/ray/lib/python3.7/site-packages/ray/rllib/agents/ppo/ppo_policy.py", line 267, in setup_mixins
ValueNetworkMixin.__init__(policy, obs_space, action_space, config)
File "/home/neigh/miniconda3/envs/ray/lib/python3.7/site-packages/ray/rllib/agents/ppo/ppo_policy.py", line 239, in __init__
@make_tf_callable(self.get_session())
File "/home/neigh/miniconda3/envs/ray/lib/python3.7/site-packages/ray/rllib/utils/tf_ops.py", line 58, in make_tf_callable
assert session_or_none is not None
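The final frame can be read as a consistency check between the execution mode and the session argument. The following is a simplified stand-in inferred from the traceback and the description above, not RLlib's actual source; check_session and fails are hypothetical names, and the eager branch is an assumption.

```python
def check_session(session_or_none, executing_eagerly):
    """Mimic the mode/session consistency check described in this issue."""
    if executing_eagerly:
        # Assumed: an eager policy has no tf.Session to pass in.
        assert session_or_none is None
    else:
        # Same assertion as the last traceback frame (tf_ops.py, line 58).
        assert session_or_none is not None


def fails(session_or_none, executing_eagerly):
    """Return True if the check raises AssertionError for these inputs."""
    try:
        check_session(session_or_none, executing_eagerly)
    except AssertionError:
        return True
    return False
```

Under this reading, an eager policy passes no session, so the check only fails when the eager flag is unexpectedly reported as False, which matches the symptom in the report.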
Issue Analytics
- State:
- Created 4 years ago
- Comments: 9 (7 by maintainers)
Top GitHub Comments
It seems to work fine when we don’t use tune, e.g.:
Hi again! The issue will be closed because there has been no more activity in the 14 days since the last message.
Please feel free to reopen or open a new issue if you’d still like it to be addressed.
Again, you can always ask for help on our discussion forum or Ray’s public slack channel.
Thanks again for opening the issue!