Huggingface 3B and 11B models not configured properly
Hello, it seems that the models are not properly configured on Hugging Face, so it is not possible to download and use them via the snippets given in the readme. If you try to do so using the code snippet from the readme:
```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_name = "allenai/unifiedqa-t5-3b"  # you can specify the model size here
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)
```
The following error occurs:
```
OSError                                   Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    927                 if resolved_archive_file is None:
--> 928                     raise EnvironmentError
    929             except EnvironmentError:

OSError:

During handling of the above exception, another exception occurred:

OSError                                   Traceback (most recent call last)
1 frames
/usr/local/lib/python3.6/dist-packages/transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    933                     f"- or '{pretrained_model_name_or_path}' is the correct path to a directory containing a file named one of {WEIGHTS_NAME}, {TF2_WEIGHTS_NAME}, {TF_WEIGHTS_NAME}.\n\n"
    934                 )
--> 935                 raise EnvironmentError(msg)
    936
    937         if resolved_archive_file == archive_file:

OSError: Can't load weights for 'allenai/unifiedqa-t5-3b'. Make sure that:
- 'allenai/unifiedqa-t5-3b' is a correct model identifier listed on 'https://huggingface.co/models'
- or 'allenai/unifiedqa-t5-3b' is the correct path to a directory containing a file named one of pytorch_model.bin, tf_model.h5, model.ckpt.
```
On Hugging Face, both the 3B and the 11B model appear to be missing the weights file when you list the model files, which is probably the cause of the issue. Is this a mistake, or is it on purpose? The original T5-11B model has all the weight files on Hugging Face, as expected.
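For reference, this can be checked from code instead of the web UI. A minimal sketch, assuming a recent huggingface_hub release (the helper and the check below are illustrative, not part of the original report):

```python
# Sketch: list the files in the model repo to see whether a weights file exists.
# Assumes a recent huggingface_hub release, which exports list_repo_files.
from huggingface_hub import list_repo_files

files = list_repo_files("allenai/unifiedqa-t5-3b")
print(files)

# from_pretrained needs one of the checkpoint files named in the error message.
expected = {"pytorch_model.bin", "tf_model.h5"}
print("weights present:", bool(expected & set(files)))
```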
PS: The large-model example in the readme also seems to be mistyped, using allenai/unifiedqa-t5-large instead of allenai/unifiedqa-large.
Thanks!
Top GitHub Comments
Aha – I’m finally able to replicate (and solve) it.
Case 1 (works): using the official transformers 4.2.1 release:

```
peter@neutronium:~/github/transformers-t5-a100$ pip install transformers==4.2.1
```
Case 2 (doesn’t work): using a very recent, but not this-minute, transformers clone (I think from the last 1-2 days):

```
peter@neutronium:~/github/transformers-t5-a100$ pip install .
```
Case 3 (works): using a completely fresh clone made minutes ago (where things in seq2seq seem to have changed significantly):

```
peter@neutronium:~/github/transformers/examples/seq2seq$ python finetune_trainer.py --data_dir $XSUM_DIR --output_dir=xsum_results --num_train_epochs 1 --model_name_or_path allenai/unifiedqa-t5-11b
```

(This one produces a lot of output, but it also starts downloading the model successfully.)

In summary: I have no idea what’s wonky about the pull I’ve been using from the last few days, but there seem to have been significant changes today, and it now fetches the 11B model successfully too.
Only in transformers can the library you’re using change significantly over hours… thanks for your help!
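When chasing issues like this, a quick diagnostic is to confirm which transformers build is actually being imported. A minimal sketch (standard Python introspection only, not something from the original thread):

```python
# Sketch: confirm the installed transformers version and its install location.
import transformers

print(transformers.__version__)  # e.g. "4.2.1" for a release, or a dev version for a clone
print(transformers.__file__)     # the path shows whether this is a pip release or a local checkout
```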
The issue should be resolved now, as per this conversation: https://github.com/huggingface/transformers/issues/8480. Let me know if you see any further issues, @Shamdan17.
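As a closing check, loading the model and running a single query should now work. A minimal sketch, assuming the input convention from the UnifiedQA readme (question and options separated by a literal backslash-n; the example strings are illustrative):

```python
# Sketch: verify the fixed checkpoint end to end with one generation call.
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_name = "allenai/unifiedqa-t5-3b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

# UnifiedQA inputs use a literal backslash-n as the separator, hence the "\\n".
input_string = "which is the best conductor? \\n (a) iron (b) feather"
input_ids = tokenizer.encode(input_string, return_tensors="pt")
output_ids = model.generate(input_ids)
print(tokenizer.batch_decode(output_ids, skip_special_tokens=True))
```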