Couldn't run the run_clip.py successfully

System Info

I couldn’t run the code successfully following the README.md (https://github.com/huggingface/transformers/tree/main/examples/pytorch/contrastive-image-text#readme).

"""
COCO_DIR = "data"
ds = datasets.load_dataset("ydshieh/coco_dataset_script", "2017", data_dir=COCO_DIR)
"""

"""
python examples/pytorch/contrastive-image-text/run_clip.py \
    --output_dir ./clip-roberta-finetuned \
    --model_name_or_path ./clip-roberta \
    --data_dir ./data \
    --dataset_name ydshieh/coco_dataset_script \
    --dataset_config_name=2017 \
    --image_column image_path \
    --caption_column caption \
    --remove_unused_columns=False \
    --do_train --do_eval \
    --per_device_train_batch_size="64" \
    --per_device_eval_batch_size="64" \
    --learning_rate="5e-5" --warmup_steps="0" --weight_decay 0.1 \
    --overwrite_output_dir \
    --push_to_hub
"""

The error is: FileNotFoundError: Couldn’t find file at https://huggingface.co/datasets/ydshieh/coco_dataset_script/resolve/main/data/train2017.zip
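
The URL in the error ends in data/train2017.zip, which looks like the relative data_dir value appended to the dataset repository URL rather than resolved against the local working directory. One thing worth trying, as a sketch and not a fix confirmed in this thread, is to pass an absolute data_dir and to check that the archive exists locally before loading. The helper name resolve_data_dir and the REQUIRED_ARCHIVES tuple are hypothetical; only train2017.zip is taken from the error message.

```python
import os

# Assumption: only train2017.zip is confirmed by the error message above;
# extend this tuple with any other archives your setup needs.
REQUIRED_ARCHIVES = ("train2017.zip",)


def resolve_data_dir(data_dir, required=REQUIRED_ARCHIVES):
    """Return (absolute_data_dir, missing_files) for a local data directory."""
    abs_dir = os.path.abspath(data_dir)
    missing = [
        name for name in required
        if not os.path.isfile(os.path.join(abs_dir, name))
    ]
    return abs_dir, missing


if __name__ == "__main__":
    abs_dir, missing = resolve_data_dir("data")
    if missing:
        print(f"Missing in {abs_dir}: {', '.join(missing)}")
    else:
        import datasets  # imported lazily so the check itself runs without it

        ds = datasets.load_dataset(
            "ydshieh/coco_dataset_script", "2017", data_dir=abs_dir
        )
        print(ds)
```

If the check reports a missing archive, the file has to be placed in that directory before load_dataset is called; if nothing is missing, the absolute path is what gets passed as data_dir.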

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, …)
  • My own task or dataset (give details below)

Reproduction

Run the load_dataset snippet and the run_clip.py command shown under System Info. The run fails with the same FileNotFoundError for train2017.zip.

Expected behavior

The code runs successfully.

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 13 (1 by maintainers)

Top GitHub Comments

1 reaction
lchwhut commented, Aug 24, 2022

The train split of “ds” that I got has only 80 examples, but the real size is more than 20,000. As you can see, train2012.zip takes up 19 GB.
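
A quick way to catch a truncated load like this is to compare each split's length against a rough lower bound right after loading. The sketch below assumes nothing about the dataset beyond it being a mapping from split name to a sized object, which is how a loaded DatasetDict behaves; the helper names split_sizes and undersized_splits are hypothetical, and the stand-in dictionary replaces a real load_dataset call so the example runs on its own.

```python
def split_sizes(ds):
    """Map each split name to its number of examples."""
    return {name: len(split) for name, split in ds.items()}


def undersized_splits(ds, minimums):
    """Return {split: (actual, expected_minimum)} for splits below their minimum."""
    sizes = split_sizes(ds)
    return {
        name: (sizes.get(name, 0), minimum)
        for name, minimum in minimums.items()
        if sizes.get(name, 0) < minimum
    }


if __name__ == "__main__":
    # Stand-in for the object returned by datasets.load_dataset(...);
    # any mapping of split name -> sized object works here.
    ds = {"train": list(range(80))}
    # 20_000 is the lower bound mentioned in the comment above.
    print(undersized_splits(ds, {"train": 20_000}))
```

An empty result means every listed split met its minimum; a split that is absent from the dataset is reported with an actual size of 0.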

0 reactions
ydshieh commented, Aug 25, 2022

@lchwhut Thank you for the detailed information. Glad it works for you now. It’s probably good for me to make a comment on my dataset page regarding this.
