How to log the gradient and weights of the actor policy in TD3 using tensorboard?

Code snippet:

import tensorflow as tf
def Callback(_locals, _globals):
    self_ = _locals['self']
    tf.summary.histogram("Actor Network Weights Histogram", self_.policy_out.policy.??? )

I cannot work out from the definition of

class FeedForwardPolicy(TD3Policy):

how to get the trainable variables for the network, something like tf.trainable_variables(), which is used to plot histograms in TensorBoard.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 6

Top GitHub Comments

araffin commented, Oct 28, 2019 (1 reaction)

Hello,

You should look at the code of TD3 itself rather than the policy, since that is where we use tf.trainable_variables(). (I also recommend looking at PPO2, where we log more things using TensorBoard.) You can get the names of the weights (scope + name) using model.get_parameter_list() (see the documentation).

Since https://github.com/hill-a/stable-baselines/issues/409, we also added some documentation on how to log additional variables.
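
As a rough illustration of the "scope + name" idea, here is a minimal sketch of picking out only the actor's parameters by scope prefix. It assumes a name -> value mapping like the one model.get_parameters() returns in stable-baselines. The helper select_by_scope, the "model/pi/" prefix, and the example names and values are assumptions for illustration and should be checked against the names your own model reports.

```python
from collections import OrderedDict


def select_by_scope(named_params, scope_prefix):
    """Keep the entries of a name -> value mapping whose name starts with scope_prefix."""
    return OrderedDict(
        (name, value)
        for name, value in named_params.items()
        if name.startswith(scope_prefix)
    )


# Stand-in for the mapping returned by model.get_parameters().
# These names and values are hypothetical, not read from a real TD3 model.
example_params = OrderedDict([
    ("model/pi/fc0/kernel:0", [0.1, -0.2, 0.3]),
    ("model/pi/fc0/bias:0", [0.0]),
    ("model/values_fn/qf1/fc0/kernel:0", [0.5, 0.4]),
    ("target/pi/fc0/kernel:0", [0.1, -0.2, 0.3]),
])

# Assumed actor scope; the critic and target-network entries are filtered out.
actor_params = select_by_scope(example_params, "model/pi/")
for name in actor_params:
    print(name)
```

The same prefix test could be applied to the .name attribute of the variables returned by model.get_parameter_list(), and each selected variable then passed to tf.summary.histogram.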
