
TFClipModel fails to train because of None loss


System Info

  • transformers version: 4.21.1
  • Platform: macOS Big Sur 11.6.7
  • Python version: 3.8.13
  • Huggingface_hub version: 0.8.1
  • Tensorflow version (GPU?): 2.7.3 (False)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: no
  • Using distributed or parallel set-up in script?: no

Who can help?

@patil-suraj

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, …)
  • My own task or dataset (give details below)

Reproduction

Below is the script run to attempt to fit the model to the example data. It is taken from the 4.21.1 docs, with the addition of the compile/fit calls. The same error arose when working with my own project. The loss is always None, as are y and y_pred. The problem appears to lie somewhere in the logic of https://github.com/huggingface/transformers/blob/132402d752044301b37e54405832738b16f49df6/src/transformers/modeling_tf_utils.py#L1116.

from PIL import Image
import requests
from transformers import CLIPProcessor, TFCLIPModel
import tensorflow as tf

# Load the pretrained CLIP model and its processor
model = TFCLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Example image from the COCO validation set
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(
    text=["a photo of a cat", "a photo of a dog"], images=[image, image], return_tensors="tf", padding=True
)

# A plain forward pass works fine
outputs = model(**inputs)

# Compile with no explicit loss so Keras falls back to the model's internal loss,
# then attempt a training step on the processed inputs
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.00001))
model.fit(dict(inputs))

This fails with a zero-gradient error because the gradients are all 0, which I suspect is caused by y and y_pred both being empty dicts.
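
For comparison (not part of the original report), here is a minimal sketch of a manual training step that asks the model for its own contrastive loss via return_loss=True instead of relying on Keras' built-in loss handling. It assumes the model, processor inputs, and learning rate from the script above.

# Minimal manual-training-step sketch: request the model's own contrastive
# loss with return_loss=True and apply the gradients directly, bypassing the
# built-in Keras loss path. Assumes `model` and `inputs` from the script above.
optimizer = tf.keras.optimizers.Adam(learning_rate=0.00001)

with tf.GradientTape() as tape:
    outputs = model(**inputs, return_loss=True, training=True)
    loss = outputs.loss  # contrastive loss computed inside TFCLIPModel

grads = tape.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(grads, model.trainable_variables))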

Expected behavior

model.fit() on the inputs produced by the processor completes a training step without error.

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 14 (6 by maintainers)

Top GitHub Comments

2 reactions
Rocketknight1 commented, Aug 18, 2022

@taymills No problem! Fixing the tests has exposed a few other issues though, which that PR will need to fix as well. Unfortunately, you’re stuck in the PR branch for now, but I’ll ping you and close this issue when it’s merged to main!

1 reaction
Rocketknight1 commented, Aug 18, 2022

@taymills yes, that’s part of this PR! When using the built-in loss, we now force return_loss=True for models where it is an argument. That should avoid this for CLIP and for other similar models in future.
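
To make the described behaviour concrete, here is an illustrative sketch (not the actual code from the PR) of what forcing return_loss=True could look like: when the user relies on the built-in loss and the model's call() signature accepts a return_loss argument, the training step sets it to True so the model returns its own loss.

# Illustrative sketch only, not the code from the PR: inject return_loss=True
# when the built-in loss is used and the model's call() accepts that argument.
import inspect

def force_return_loss(model, inputs):
    if "return_loss" in inspect.signature(model.call).parameters:
        inputs = dict(inputs)
        inputs["return_loss"] = True  # the model will compute its contrastive loss
    return inputs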

