`return_loss=True` in call for `TFCLIPModel` bugs out.

System Info

transformers version: 4.23.1
Platform: Linux-5.10.133+-x86_64-with-Ubuntu-18.04-bionic
Python version: 3.7.15
Huggingface_hub version: 0.10.1
PyTorch version (GPU?): 1.12.1+cu113 (False)
Tensorflow version (GPU?): 2.9.2 (False)
Flax version (CPU?/GPU?/TPU?): not installed (NA)
Jax version: not installed
JaxLib version: not installed
Using GPU in script?: No
Using distributed or parallel set-up in script?: No

Who can help?

@patil-suraj

Information

The official example scripts
My own modified scripts

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, …)
My own task or dataset (give details below)

Reproduction

To reproduce the bug I have used the following code snippet 👇

import tensorflow as tf
from PIL import Image
import requests
from transformers import CLIPProcessor, TFCLIPModel

model = TFCLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(
    text=["a photo of a cat", "a photo of a dog"], images=image, return_tensors="tf", padding=True
)

outputs = model(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    attention_mask=inputs["attention_mask"],
    return_loss=True,
    return_dict=True,
)

Expected behavior

The call should execute and we should obtain the outputs.

Issue Analytics

State:
Created a year ago
Comments: 14 (13 by maintainers)

Top GitHub Comments

sgugger commented, Nov 21, 2022

It looks like the problem in this issue is that you are not passing along as many images as texts. Passing images=[image, image] makes your reproducer pass.
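One plausible way to see why the counts have to match is a minimal sketch of a CLIP-style symmetric contrastive loss in plain Python. This is not the transformers implementation: the helper names `cross_entropy` and `clip_style_loss` are hypothetical, the similarity values are made up for illustration, and the sketch assumes text i is paired with image i, so the labels run from 0 to n-1 and the similarity matrix has to be square.

```python
import math


def cross_entropy(logits, labels):
    """Mean cross-entropy of integer labels against rows of logits."""
    total = 0.0
    for row, label in zip(logits, labels):
        if not 0 <= label < len(row):
            raise ValueError(
                f"label {label} is out of range for a row with {len(row)} logits"
            )
        # Numerically stable log-sum-exp over the row.
        peak = max(row)
        log_norm = peak + math.log(sum(math.exp(v - peak) for v in row))
        total += log_norm - row[label]
    return total / len(logits)


def clip_style_loss(logits_per_text):
    """Symmetric contrastive loss, assuming text i is paired with image i."""
    n_text = len(logits_per_text)
    n_image = len(logits_per_text[0])
    # Transpose to get one row of text similarities per image.
    logits_per_image = [
        [logits_per_text[t][i] for t in range(n_text)] for i in range(n_image)
    ]
    text_loss = cross_entropy(logits_per_text, list(range(n_text)))
    image_loss = cross_entropy(logits_per_image, list(range(n_image)))
    return (text_loss + image_loss) / 2.0


# Two texts, one image: the matrix is 2x1, so text 1 has no matching image 1.
# The similarity values here are invented for illustration only.
try:
    clip_style_loss([[24.5], [19.3]])
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)

# Two texts, two images (as with images=[image, image]): the matrix is 2x2.
square_loss = clip_style_loss([[24.5, 24.5], [19.3, 19.3]])
```

In this sketch the 2x1 case fails because the second text is labelled with an image index that does not exist, while the 2x2 case returns a finite loss. Whether the library fails for exactly this reason is an assumption, but it is consistent with the reproducer passing once `images=[image, image]` is used.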

raghavanone commented, Dec 20, 2022

@ArthurZucker oh, great, let me look at the fix. Last time I checked, the way the contrastive loss was computed was flawed.