Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging third-party libraries. It collects links to all the places you might be looking while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

run_clip.py RuntimeError

See original GitHub issue

System Info

  • transformers version: 4.22.0.dev0
  • Platform: Linux-3.10.0-957.el7.x86_64-x86_64-with-glibc2.17
  • Python version: 3.9.12
  • Huggingface_hub version: 0.8.1
  • PyTorch version (GPU?): 1.12.0+cu102 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: <fill in>
  • Using distributed or parallel set-up in script?: <fill in>

Who can help?

Hi @patil-suraj, when I run run_clip.py following the steps in the README, I get the following error:

[INFO|trainer.py:2644] 2022-08-02 04:07:15,699 >> Saving model checkpoint to clip-roberta-finetuned/checkpoint-4500
[INFO|configuration_utils.py:446] 2022-08-02 04:07:15,701 >> Configuration saved in clip-roberta-finetuned/checkpoint-4500/config.json
[INFO|modeling_utils.py:1567] 2022-08-02 04:07:17,602 >> Model weights saved in clip-roberta-finetuned/checkpoint-4500/pytorch_model.bin
/root/anaconda3/envs/h-transformers/lib/python3.9/site-packages/torch/nn/parallel/_functions.py:68: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
  warnings.warn('Was asked to gather along dimension 0, but all '
 33%|█████████████████████████████████████████████████████████████████████████████▉    | 4623/13872 [1:56:27<3:50:22,  1.49s/it]
Traceback (most recent call last):
  File "/home/gsj/transformers/examples/pytorch/contrastive-image-text/run_clip.py", line 537, in <module>
    main()
  File "/home/gsj/transformers/examples/pytorch/contrastive-image-text/run_clip.py", line 508, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/root/anaconda3/envs/h-transformers/lib/python3.9/site-packages/transformers/trainer.py", line 1502, in train
    return inner_training_loop(
  File "/root/anaconda3/envs/h-transformers/lib/python3.9/site-packages/transformers/trainer.py", line 1744, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/root/anaconda3/envs/h-transformers/lib/python3.9/site-packages/transformers/trainer.py", line 2474, in training_step
    loss = self.compute_loss(model, inputs)
  File "/root/anaconda3/envs/h-transformers/lib/python3.9/site-packages/transformers/trainer.py", line 2506, in compute_loss
    outputs = model(**inputs)
  File "/root/anaconda3/envs/h-transformers/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/anaconda3/envs/h-transformers/lib/python3.9/site-packages/torch/nn/parallel/data_parallel.py", line 169, in forward
    return self.gather(outputs, self.output_device)
  File "/root/anaconda3/envs/h-transformers/lib/python3.9/site-packages/torch/nn/parallel/data_parallel.py", line 181, in gather
    return gather(outputs, output_device, dim=self.dim)
  File "/root/anaconda3/envs/h-transformers/lib/python3.9/site-packages/torch/nn/parallel/scatter_gather.py", line 78, in gather
    res = gather_map(outputs)
  File "/root/anaconda3/envs/h-transformers/lib/python3.9/site-packages/torch/nn/parallel/scatter_gather.py", line 69, in gather_map
    return type(out)((k, gather_map([d[k] for d in outputs]))
  File "<string>", line 10, in __init__
  File "/root/anaconda3/envs/h-transformers/lib/python3.9/site-packages/transformers/utils/generic.py", line 188, in __post_init__
    for element in iterator:
  File "/root/anaconda3/envs/h-transformers/lib/python3.9/site-packages/torch/nn/parallel/scatter_gather.py", line 69, in <genexpr>
    return type(out)((k, gather_map([d[k] for d in outputs]))
  File "/root/anaconda3/envs/h-transformers/lib/python3.9/site-packages/torch/nn/parallel/scatter_gather.py", line 63, in gather_map
    return Gather.apply(target_device, dim, *outputs)
  File "/root/anaconda3/envs/h-transformers/lib/python3.9/site-packages/torch/nn/parallel/_functions.py", line 75, in forward
    return comm.gather(inputs, ctx.dim, ctx.target_device)
  File "/root/anaconda3/envs/h-transformers/lib/python3.9/site-packages/torch/nn/parallel/comm.py", line 235, in gather
    return torch._C._gather(tensors, dim, destination)
RuntimeError: Input tensor at index 1 has invalid shape [4, 4], but expected [4, 5]

How can I solve this error? Thanks!
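The traceback ends inside nn.DataParallel's gather, which concatenates per-GPU outputs along dim 0 and requires all other dims to match. A CLIP-style model returns a square similarity matrix of shape [local_batch, local_batch] per replica, so an uneven batch split (e.g. 9 examples over 2 GPUs → 5 and 4) produces [5, 5] and [4, 4], which cannot be gathered. Below is a minimal pure-Python sketch of that failure mode; `split_batch` and `gather_square_logits` are hypothetical illustrations of the mechanism, not code from transformers or torch:

```python
def split_batch(global_batch: int, n_gpus: int) -> list:
    """Mimic DataParallel's scatter: ceil-sized chunks, smaller remainder last."""
    chunk = -(-global_batch // n_gpus)  # ceil division
    sizes, remaining = [], global_batch
    while remaining > 0:
        sizes.append(min(chunk, remaining))
        remaining -= chunk
    return sizes


def gather_square_logits(global_batch: int, n_gpus: int):
    """Simulate gathering per-replica [local_batch, local_batch] logits along dim 0."""
    sizes = split_batch(global_batch, n_gpus)
    shapes = [(s, s) for s in sizes]  # e.g. CLIP's logits_per_image per replica
    # gather(dim=0) concatenates along dim 0; all trailing dims must match
    if len({cols for _, cols in shapes}) > 1:
        raise RuntimeError(
            f"Input tensor at index 1 has invalid shape {list(shapes[1])}, "
            f"but expected [{shapes[1][0]}, {shapes[0][1]}]"
        )
    return (sum(rows for rows, _ in shapes), shapes[0][1])


# Even split across 2 GPUs: gathers fine.
print(gather_square_logits(8, 2))
# Uneven split (e.g. the last, smaller batch of an epoch): raises the
# same "invalid shape [4, 4], but expected [4, 5]" RuntimeError as above.
```

This matches the reported shapes: a 9-example batch split 5/4 yields [5, 5] and [4, 4], and gather complains that index 1 has shape [4, 4] where it expected [4, 5].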

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, …)
  • My own task or dataset (give details below)

Reproduction

python run_clip.py

Expected behavior

run_clip.py runs successfully

Issue Analytics

  • State:closed
  • Created a year ago
  • Comments:11 (1 by maintainers)

Top GitHub Comments

1 reaction
gongshaojie12 commented, Aug 8, 2022

Hi, @ydshieh I got it, thanks a lot!

1 reaction
gongshaojie12 commented, Aug 5, 2022

> @gongshaojie12 I want to double-check: are you using multiple GPUs?

Hi @ydshieh, yes, I used two GPUs for training.
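The thread confirms the error only appears with two GPUs: when multiple GPUs are visible and no distributed launcher is used, Trainer wraps the model in nn.DataParallel, whose gather step is what fails here. Two general workarounds (a hedged sketch, not the maintainers' confirmed fix for this issue; trailing arguments elided) are to expose a single GPU, or to launch with torchrun so Trainer uses DistributedDataParallel, which computes the loss per process instead of gathering square logit matrices:

```shell
# Option 1: sidestep nn.DataParallel by exposing a single GPU
CUDA_VISIBLE_DEVICES=0 python run_clip.py ...

# Option 2: launch with torchrun so Trainer uses DistributedDataParallel
torchrun --nproc_per_node=2 run_clip.py ...
```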

Read more comments on GitHub >

Top Results From Across the Web

Error when running in CPU mode · Issue #70 - GitHub
I get RuntimeError: "softmax_lastdim_kernel_impl" not implemented for 'Half' when running this against my CPU. To reproduce. $ python generate.

RuntimeError: CUDA out of memory with Clip interrogator
When I try to run Clip interrogator on Automatic1111 (locally on PC with GTX 1060), ... I'm not into python, and I fear...

python - "RuntimeError: asyncio.run() cannot be called from a ...
It's a known problem related to IPython. One way as you already found is to use nest_asyncio : import nest_asyncio nest_asyncio.apply().

VSGAN - VapourSynth GAN Implementation ... - Doom9's Forum
Just open a console window wherever python.exe is, ... scale=2 ) clip = vsgan_device.run(clip=clip, chunk=True) clip.set_output().

VSGAN - VapourSynth GAN Implementation ... - Doom9's Forum
https://github.com/Oriode/ESRGAN-Til...ter/upscale.py. They just need to be "massaged" into ... RuntimeError: CUDA out of memory.
