"RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn" when using my loss-function
See original GitHub issue
- PyTorch-Forecasting version: 0.9.0
- PyTorch version: 1.9.0
- Python version: 3.6
- Operating System: Windows 10
Expected behavior
Hello! Thanks for your brilliant work! When using the TemporalFusionTransformer, I took the class "QuantileLoss(MultiHorizonMetric)" as a reference and modified the loss function, expecting the model predictions to become more accurate.
Actual behavior
tft = TemporalFusionTransformer.from_dataset(
    training,
    learning_rate=0.03,
    hidden_size=16,
    attention_head_size=1,
    dropout=0.1,
    hidden_continuous_size=8,
    output_size=7,  # 7 quantiles by default
    loss=MyLoss(),
    log_interval=10,  # uncomment for learning rate finder and otherwise, e.g. to 10 for logging every 10 batches
    reduce_on_plateau_patience=4,
)
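Training is then launched with a standard PyTorch Lightning trainer. The snippet below is a minimal sketch of that call, assuming typical pytorch-forecasting usage; the trainer arguments and dataloader names are assumptions, only the fit(..., val_dataloaders=...) call appears in the traceback further down.

```python
import pytorch_lightning as pl

# Minimal sketch of the training call; trainer arguments and dataloader
# names are assumptions, not taken from the original script.
trainer = pl.Trainer(max_epochs=30)
trainer.fit(
    tft,
    train_dataloader=train_dataloader,
    val_dataloaders=val_dataloader,
)
```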
However, the error is: RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
Code to reproduce the problem
The definition of MyLoss() (with the imports it relies on) is:
from typing import List

import torch

from pytorch_forecasting.metrics import MultiHorizonMetric


class MyLoss(MultiHorizonMetric):
    def __init__(
        self,
        quantiles: List[float] = [0.02, 0.1, 0.25, 0.5, 0.75, 0.9, 0.98],
        **kwargs,
    ):
        super().__init__(quantiles=quantiles, **kwargs)

    def loss(self, y_pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # calculate quantile loss with an additional second-difference smoothness penalty
        diff_2 = torch.zeros_like(target)
        for n in range(target.size(0)):
            for m in range(target.size(1)):
                if m == 0 or m == (target.size(1) - 1):
                    diff_2[n][m] = 0
                else:
                    diff_2[n][m] = torch.abs(target[n][m - 1] - 2 * target[n][m] + target[n][m + 1])
        losses = []
        for i, q in enumerate(self.quantiles):
            mae = torch.abs(y_pred[..., i] - target) / target.size(1)
            rmse = torch.sqrt(torch.pow(y_pred[..., i] - target, 2)) / target.size(1)
            loss = q * rmse + (1 - q) * mae + 0.2 * torch.pow(diff_2, 2)
            # loss = q * rmse + (1 - q) * mae
            losses.append(loss.unsqueeze(-1))
        losses = torch.cat(losses, dim=2)
        return losses

    def to_prediction(self, y_pred: torch.Tensor) -> torch.Tensor:
        """
        Convert network prediction into a point prediction.

        Args:
            y_pred: prediction output of network

        Returns:
            torch.Tensor: point prediction
        """
        if y_pred.ndim == 3:
            idx = self.quantiles.index(0.5)
            y_pred = y_pred[..., idx]
        return y_pred

    def to_quantiles(self, y_pred: torch.Tensor) -> torch.Tensor:
        """
        Convert network prediction into a quantile prediction.

        Args:
            y_pred: prediction output of network

        Returns:
            torch.Tensor: prediction quantiles
        """
        return y_pred
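A quick way to check whether the custom loss itself keeps the computational graph intact is to call it directly on a dummy prediction that requires gradients. This is a standalone sketch using the class defined above; the tensor shapes (batch, prediction horizon, number of quantiles) are assumptions:

```python
import torch

# Hypothetical shapes: 4 samples, 6 prediction steps, 7 quantiles.
target = torch.randn(4, 6)
y_pred = torch.randn(4, 6, 7, requires_grad=True)

loss_fn = MyLoss()
losses = loss_fn.loss(y_pred, target)

print(losses.shape)          # torch.Size([4, 6, 7])
print(losses.requires_grad)  # should be True if the graph is intact
print(losses.grad_fn)        # should not be None
losses.mean().backward()     # should run without the RuntimeError
```

If this standalone check passes, the graph is more likely being detached somewhere between the model output and the backward call rather than inside loss() itself.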
And the full traceback is:
File "F:/TimothyLiu/ICONIP 2021/TFI/trian_TFT.py", line 197, in <module>
val_dataloaders=val_dataloader,
File "D:\DeepLearning\Anaconda3\envs\pytorch-transformer\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 458, in fit
self._run(model)
File "D:\DeepLearning\Anaconda3\envs\pytorch-transformer\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 756, in _run
self.dispatch()
File "D:\DeepLearning\Anaconda3\envs\pytorch-transformer\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 797, in dispatch
self.accelerator.start_training(self)
File "D:\DeepLearning\Anaconda3\envs\pytorch-transformer\lib\site-packages\pytorch_lightning\accelerators\accelerator.py", line 96, in start_training
self.training_type_plugin.start_training(trainer)
File "D:\DeepLearning\Anaconda3\envs\pytorch-transformer\lib\site-packages\pytorch_lightning\plugins\training_type\training_type_plugin.py", line 144, in start_training
self._results = trainer.run_stage()
File "D:\DeepLearning\Anaconda3\envs\pytorch-transformer\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 807, in run_stage
return self.run_train()
File "D:\DeepLearning\Anaconda3\envs\pytorch-transformer\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 869, in run_train
self.train_loop.run_training_epoch()
File "D:\DeepLearning\Anaconda3\envs\pytorch-transformer\lib\site-packages\pytorch_lightning\trainer\training_loop.py", line 499, in run_training_epoch
batch_output = self.run_training_batch(batch, batch_idx, dataloader_idx)
File "D:\DeepLearning\Anaconda3\envs\pytorch-transformer\lib\site-packages\pytorch_lightning\trainer\training_loop.py", line 738, in run_training_batch
self.optimizer_step(optimizer, opt_idx, batch_idx, train_step_and_backward_closure)
File "D:\DeepLearning\Anaconda3\envs\pytorch-transformer\lib\site-packages\pytorch_lightning\trainer\training_loop.py", line 442, in optimizer_step
using_lbfgs=is_lbfgs,
File "D:\DeepLearning\Anaconda3\envs\pytorch-transformer\lib\site-packages\pytorch_lightning\core\lightning.py", line 1403, in optimizer_step
optimizer.step(closure=optimizer_closure)
File "D:\DeepLearning\Anaconda3\envs\pytorch-transformer\lib\site-packages\pytorch_lightning\core\optimizer.py", line 214, in step
self.__optimizer_step(*args, closure=closure, profiler_name=profiler_name, **kwargs)
File "D:\DeepLearning\Anaconda3\envs\pytorch-transformer\lib\site-packages\pytorch_lightning\core\optimizer.py", line 134, in __optimizer_step
trainer.accelerator.optimizer_step(optimizer, self._optimizer_idx, lambda_closure=closure, **kwargs)
File "D:\DeepLearning\Anaconda3\envs\pytorch-transformer\lib\site-packages\pytorch_lightning\accelerators\accelerator.py", line 329, in optimizer_step
self.run_optimizer_step(optimizer, opt_idx, lambda_closure, **kwargs)
File "D:\DeepLearning\Anaconda3\envs\pytorch-transformer\lib\site-packages\pytorch_lightning\accelerators\accelerator.py", line 336, in run_optimizer_step
self.training_type_plugin.optimizer_step(optimizer, lambda_closure=lambda_closure, **kwargs)
File "D:\DeepLearning\Anaconda3\envs\pytorch-transformer\lib\site-packages\pytorch_lightning\plugins\training_type\training_type_plugin.py", line 193, in optimizer_step
optimizer.step(closure=lambda_closure, **kwargs)
File "D:\DeepLearning\Anaconda3\envs\pytorch-transformer\lib\site-packages\torch\optim\optimizer.py", line 88, in wrapper
return func(*args, **kwargs)
File "D:\DeepLearning\Anaconda3\envs\pytorch-transformer\lib\site-packages\pytorch_forecasting\optim.py", line 131, in step
_ = closure()
File "D:\DeepLearning\Anaconda3\envs\pytorch-transformer\lib\site-packages\pytorch_lightning\trainer\training_loop.py", line 733, in train_step_and_backward_closure
split_batch, batch_idx, opt_idx, optimizer, self.trainer.hiddens
File "D:\DeepLearning\Anaconda3\envs\pytorch-transformer\lib\site-packages\pytorch_lightning\trainer\training_loop.py", line 836, in training_step_and_backward
self.backward(result, optimizer, opt_idx)
File "D:\DeepLearning\Anaconda3\envs\pytorch-transformer\lib\site-packages\pytorch_lightning\trainer\training_loop.py", line 870, in backward
result.closure_loss, optimizer, opt_idx, should_accumulate, *args, **kwargs
File "D:\DeepLearning\Anaconda3\envs\pytorch-transformer\lib\site-packages\pytorch_lightning\accelerators\accelerator.py", line 309, in backward
self.lightning_module, closure_loss, optimizer, optimizer_idx, should_accumulate, *args, **kwargs
File "D:\DeepLearning\Anaconda3\envs\pytorch-transformer\lib\site-packages\pytorch_lightning\plugins\precision\precision_plugin.py", line 79, in backward
model.backward(closure_loss, optimizer, opt_idx)
File "D:\DeepLearning\Anaconda3\envs\pytorch-transformer\lib\site-packages\pytorch_lightning\core\lightning.py", line 1275, in backward
loss.backward(*args, **kwargs)
File "D:\DeepLearning\Anaconda3\envs\pytorch-transformer\lib\site-packages\torch\_tensor.py", line 255, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "D:\DeepLearning\Anaconda3\envs\pytorch-transformer\lib\site-packages\torch\autograd\__init__.py", line 149, in backward
allow_unreachable=True, accumulate_grad=True) # allow_unreachable flag
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
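For context (independent of this particular model), the RuntimeError above is raised whenever backward() is called on a tensor that is not connected to the autograd graph, for example because the loss was computed under torch.no_grad() or after a .detach(), .item(), or .numpy() call. A minimal, generic reproduction of the same message:

```python
import torch

x = torch.randn(3, requires_grad=True)

loss = (x ** 2).sum().detach()  # detaching removes the grad_fn
loss.backward()  # RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
```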
Issue Analytics
- Created: 2 years ago
- Comments: 8 (1 by maintainers)
Are you using multiple GPUs? For what it's worth, I'm also hitting this error, but only when using multiple GPUs and multiple targets.
I've made sure there are no nulls in my dataset, values are normalized, and I'm using a low learning rate with clipped gradients to reduce instability (a rough sketch of such a trainer configuration follows below). Here's what I'm noticing:
#908
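The settings mentioned above (multiple GPUs, clipped gradients, low learning rate) would roughly correspond to a trainer configuration like the following; the values are illustrative assumptions, not taken from the comment:

```python
import pytorch_lightning as pl

# Illustrative configuration only; all values are assumptions.
trainer = pl.Trainer(
    gpus=2,                 # multiple GPUs, where the error is observed
    gradient_clip_val=0.1,  # clipped gradients to reduce instability
    max_epochs=30,
)
```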