ValueError: only one element tensors can be converted to Python scalars
🐛 Bug
This happens in the training loop:
ValueError: only one element tensors can be converted to Python scalars
To Reproduce
From my observation, I believe this happens when the batch size isn't divisible by the number of GPUs, for example on the last batch of each epoch, or when you have 4 GPUs but set the batch size to 2.
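For illustration, a minimal setup along these lines should trigger the mismatch. The module, dataset, and hyperparameters are hypothetical, and the `gpus`/`distributed_backend` Trainer arguments follow the Lightning API from the era of this issue:

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl

# Hypothetical minimal module; only the gpus/batch-size mismatch matters here.
class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(8, 1)

    def forward(self, x):
        return self.layer(x)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return {'loss': F.mse_loss(self(x), y)}

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.01)

dataset = TensorDataset(torch.randn(10, 8), torch.randn(10, 1))
loader = DataLoader(dataset, batch_size=2)  # batch smaller than the GPU count

# With 4 GPUs and batch_size=2, DataParallel cannot spread each batch over
# all devices, which is the situation that triggered the ValueError above.
trainer = pl.Trainer(gpus=4, distributed_backend='dp')
trainer.fit(LitModel(), loader)
```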
Additional context
I think it would be nice to use only some of the GPUs the user specified, while printing out a message telling them that the GPUs are not configured correctly (a sketch of that behavior follows below). The current implementation simply throws an unfriendly error.
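As a sketch of that suggestion (not actual Lightning code), the guard could look something like this, with a hypothetical `clamp_gpus` helper:

```python
import warnings

def clamp_gpus(requested_gpus: int, batch_size: int) -> int:
    # Hypothetical guard, not actual Lightning code: fall back to as many
    # GPUs as the batch can feed instead of failing on an undersized batch.
    usable = min(requested_gpus, batch_size)
    if usable < requested_gpus:
        warnings.warn(
            f"batch_size={batch_size} cannot feed {requested_gpus} GPUs; "
            f"falling back to {usable}."
        )
    return usable
```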
@Ir1d found the bug in the trainer code. It does not reduce the outputs if the output size of the training step does not equal the number of GPUs. I will make a PR to fix it.
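To make the failure mode concrete: in DataParallel mode each GPU returns its own loss, so the step output is a small 1-D tensor, and calling `.item()` on it raises the error above. A rough sketch of the missing reduction (illustrative only, not the actual trainer source):

```python
import torch

def reduce_dp_output(loss: torch.Tensor) -> float:
    # Illustrative only, not the actual trainer source. In dp mode the step
    # returns one loss per GPU (a 1-D tensor); .item() only works on a
    # single-element tensor, hence the ValueError when this step is skipped.
    if loss.dim() > 0:
        loss = loss.mean()  # reduce across GPUs
    return loss.item()
```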
Hi @Richarizardd I looked at your code and I found that in your `validation_epoch_end` you don't reduce the outputs properly. PL does not do this for you. This is intentional, right @williamFalcon? So, in your `validation_epoch_end`, instead of converting the raw outputs to a scalar directly, you should reduce them across GPUs first, as sketched below.
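The exact snippets from the original comment are not reproduced on this page; the following is a minimal sketch of the usual pattern, assuming each validation step returns a dict with a (hypothetical) `val_loss` key and using the `log` dict return that Lightning expected at the time of this issue:

```python
import torch

def validation_epoch_end(self, outputs):
    # In dp mode each entry may hold one value per GPU rather than a scalar,
    # so reduce each entry before stacking and averaging over the epoch.
    avg_loss = torch.stack([x['val_loss'].mean() for x in outputs]).mean()
    return {'val_loss': avg_loss, 'log': {'val_loss': avg_loss}}
```

The per-entry `.mean()` collapses the per-GPU values, so `avg_loss` ends up as a single-element tensor that can safely be logged or converted with `.item()`.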
I tested this by adding it to your code and it worked (no error). As far as I can tell, this is not a bug in PL. However, we could print a better error message.
@Ir1d you probably had the same mistake.