A2C derivative for mean is not implemented

See original GitHub issue
2019-11-12 11:18:32.850181  | a2c_pong_0 Optimizing over 1250 iterations.
Traceback (most recent call last):
  File "example_3.py", line 80, in <module>
    n_parallel=args.n_parallel,
  File "example_3.py", line 62, in build_and_train
    runner.train()
  File "//rlpyt/rlpyt/runners/minibatch_rl.py", line 196, in train
    opt_info = self.algo.optimize_agent(itr, samples)
  File "//rlpyt/rlpyt/algos/pg/a2c.py", line 38, in optimize_agent
    loss.backward()
  File "//lib/python3.7/site-packages/torch/tensor.py", line 166, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "//lib/python3.7/site-packages/torch/autograd/__init__.py", line 99, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: derivative for mean is not implemented
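The error surfaces when `loss.backward()` tries to differentiate through a `mean` call whose derivative is not registered in the affected PyTorch build. As a hedged sketch (not rlpyt's actual code), the same reduction can be expressed through ops that do have registered derivatives, which lets backprop proceed:

```python
import torch

# Sketch of a workaround, assuming an affected PyTorch build where the
# derivative for a built-in reduction (here, mean) is not registered:
# express the same reduction via sum and a scalar division, both of
# which have registered derivatives.
x = torch.ones(4, requires_grad=True)
loss = x.sum() / x.numel()  # numerically identical to x.mean()
loss.backward()
print(x.grad)  # each of the 4 elements receives gradient 1/4
```

This is only a stopgap for builds hit by the bug; on a fixed build, `x.mean()` backpropagates the same gradient directly.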

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 9 (3 by maintainers)

Top GitHub Comments

1 reaction
tarungog commented, Dec 21, 2019

https://github.com/pytorch/pytorch/pull/29199

It’s a PyTorch bug that was only recently fixed on master.
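Since the fix landed upstream, the usual remedy is to upgrade PyTorch rather than patch rlpyt. A minimal sketch of a version gate (the 1.4.0 threshold and the helper name are assumptions based on the comment date; check the linked PR for the exact release that shipped the fix):

```python
# Hypothetical helper: check whether an installed PyTorch version string
# is new enough to include the fix from pytorch/pytorch#29199.
# The (1, 4, 0) threshold is an assumption based on the comment date.
def has_mean_derivative_fix(version: str, fixed_in=(1, 4, 0)) -> bool:
    # Strip a local build tag like "+cu118", then compare the numeric parts.
    parts = tuple(int(p) for p in version.split("+")[0].split(".")[:3])
    return parts >= fixed_in

print(has_mean_derivative_fix("1.3.1"))  # False: affected by the bug
print(has_mean_derivative_fix("1.4.0"))  # True: assumed to include the fix
```

In practice one would pass `torch.__version__` to such a check, or simply upgrade to the latest stable release.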

0 reactions
astooke commented, Feb 26, 2020

OK, if I understand correctly, this is a PyTorch issue and no change is needed in rlpyt. Closing, but please reopen if that’s wrong.


Top Results From Across the Web

  • Understanding Actor Critic Methods and A2C | by Chris Yoon
    In my previous post, we derived policy gradients and implemented the REINFORCE algorithm (also known as Monte Carlo policy gradients).
  • the derivative for 'target' is not implemented - Stack Overflow
    I added two VAEs to the original model, so I need to add an optimizer and loss. However, the following errors are reported. How...
  • An intro to Advantage Actor Critic methods: let's play Sonic the ...
    We have two different strategies to implement an Actor Critic agent: A2C (aka Advantage Actor Critic); A3C (aka Asynchronous Advantage Actor ...
  • pytorch/derivatives.yaml at master - GitHub
    Defines derivative formulas and Python signatures of methods on Variable. Note about possibly confusing nomenclature: An 'output gradient' is the ...