Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
Environment info
- transformers version: 4.11.3
- Platform: Linux-5.4.0-84-generic-x86_64-with-glibc2.29
- Python version: 3.8.10
- PyTorch version (GPU?): 1.8.1+cu111 (True)
- Tensorflow version (GPU?): 2.5.0 (False)
- Flax version (CPU?/GPU?/TPU?): 0.3.4 (cpu)
- Jax version: 0.2.20
- JaxLib version: 0.1.71
- Using GPU in script?: Yes
- Using distributed or parallel set-up in script?: Yes (Multi GPU setting)
Who can help
@patrickvonplaten, @patil-suraj
Information
Model I am using: LED
The problem arises when using:
- the official example script: Fine-tuning Longformer Encoder-Decoder (LED)
To reproduce
Steps to reproduce the behavior:
- Execute the above script in a 2-GPU setting. The following warning is emitted:
../lib/python3.8/site-packages/torch/nn/parallel/_functions.py:65: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
This looks like only a warning, but the program hangs at that point and does not execute further.
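For context, this warning comes from torch.nn.DataParallel: each GPU replica returns a 0-dim (scalar) loss, and gather cannot concatenate scalars along dim 0, so it unsqueezes them into a length-num_gpus vector instead. Below is a minimal sketch that reproduces the warning on a machine with at least two GPUs, using a hypothetical toy model (not the actual LED script), together with the usual `.mean()` reduction:

```python
import torch
import torch.nn as nn

# Hypothetical toy model standing in for the LED fine-tuning setup:
# it returns a 0-dim (scalar) loss per replica, which is what triggers
# the gather warning under nn.DataParallel.
class ToyLossModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 1)

    def forward(self, x):
        return self.linear(x).sum()  # 0-dim tensor per replica

model = nn.DataParallel(ToyLossModel()).cuda()  # replicates over all visible GPUs
inputs = torch.randn(8, 4, device="cuda")

loss = model(inputs)  # with 2 GPUs this is a shape-(2,) vector, one loss per replica
loss = loss.mean()    # reduce back to a single scalar before backward()
loss.backward()
```

Note that the warning itself is benign (the scalars are simply gathered into a vector), so the hang reported above likely has a separate cause in the multi-GPU setup.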
Expected behavior
The program should complete all training epochs.
To continue, could you maybe do the following:
a) Verify that the code works with a single GPU (set CUDA_VISIBLE_DEVICES="0").
b) Try to run the script with DDP; the Trainer supports DDP out of the box. See: https://github.com/huggingface/transformers/tree/master/examples/pytorch#distributed-training-and-mixed-precision
Both steps are sketched below. Let me know if you still run into problems 😃
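For reference, a hedged sketch of what (a) and (b) could look like from the shell. Here `run_led_finetune.py` is a placeholder name for the fine-tuning script, and the DDP launch assumes the script accepts the `--local_rank` argument that `torch.distributed.launch` injects (the official example scripts do):

```bash
# a) Single-GPU sanity check: with only GPU 0 visible, the Trainer never
#    wraps the model in nn.DataParallel, so the gather warning cannot occur.
CUDA_VISIBLE_DEVICES="0" python run_led_finetune.py

# b) DDP launch across 2 GPUs (PyTorch 1.8-era launcher); the Trainer picks
#    up the distributed environment automatically.
python -m torch.distributed.launch --nproc_per_node=2 run_led_finetune.py
```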
Sorry, could anyone clarify what should be done when using the Trainer? I have the same problem as rajgar114, but I am using the Hugging Face Trainer, and the program stops and does not execute further.