ValueError: Tokenizer class T5Tokenizer does not exist or is not currently imported.

Environment info

  • transformers version: Latest transformers==4.2.0.dev0
  • Platform: Colab
  • Python version: Python 3.6.9
  • PyTorch version (GPU?): torch==1.7.0+cu101
  • Tensorflow version (GPU?):
  • Using GPU in script?: No
  • Using distributed or parallel set-up in script?: No

Who can help

@mfuntowicz

Information

The following code, featured in the latest HF newsletter, seems to have issues. When I ran it, I got a tokenizer error with both the fast and the slow tokenizer (use_fast=True and use_fast=False).

The problem arises when using:

  • the official example scripts: (give details below)
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM 

tokenizer = AutoTokenizer.from_pretrained("mrm8488/mT5-small-finetuned-tydiqa-for-xqa", use_fast=False)

model = AutoModelForSeq2SeqLM.from_pretrained("mrm8488/mT5-small-finetuned-tydiqa-for-xqa")

context = "HuggingFace won the best Demo paper at EMNLP2020."
question = "What won HuggingFace?"
input_text = 'question: %s context: %s' % (question, context)
features = tokenizer([input_text], return_tensors='pt')
output = model.generate(**features)
tokenizer.decode(output[0])

To reproduce

Steps to reproduce the behavior:

  1. Run the above code on Google Colab

ERROR reported

`ValueError                                Traceback (most recent call last)
<ipython-input-3-87256159791c> in <module>()
     10 from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
     11
---> 12 tokenizer = AutoTokenizer.from_pretrained("mrm8488/mT5-small-finetuned-tydiqa-for-xqa",use_fast=False )
     13
     14 model = AutoModelForSeq2SeqLM.from_pretrained("mrm8488/mT5-small-finetuned-tydiqa-for-xqa")

/usr/local/lib/python3.6/dist-packages/transformers/models/auto/tokenization_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
    358             if tokenizer_class is None:
    359                 raise ValueError(
--> 360                     "Tokenizer class {} does not exist or is not currently imported.".format(tokenizer_class_candidate)
    361                 )
    362             return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)

ValueError: Tokenizer class T5Tokenizer does not exist or is not currently imported.`

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 9 (4 by maintainers)

Top GitHub Comments

11 reactions
gingsi commented, Nov 30, 2021

I had a similar problem (ValueError: Tokenizer class M2M100Tokenizer does not exist or is not currently imported.) and solved it by running pip install sentencepiece.

It seems that when the sentencepiece package is missing, AutoTokenizer.from_pretrained silently fails to load the tokenizer class and then crashes later.
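The failure mode described in this comment can be illustrated with a toy sketch. This is not the actual transformers code: `resolve_tokenizer_class`, `registry`, and `ToyTokenizer` are hypothetical names, and the assumption is only that a class whose optional dependency is missing ends up as None, which is the condition the traceback checks before raising.

```python
class ToyTokenizer:
    """Hypothetical tokenizer whose dependencies are all available."""


# Assumption for illustration: a tokenizer whose optional dependency
# (for example sentencepiece) is missing is registered as None instead of
# failing at import time, so nothing goes wrong until it is looked up.
registry = {
    "ToyTokenizer": ToyTokenizer,
    "T5Tokenizer": None,
}


def resolve_tokenizer_class(name, registry):
    """Return the class registered under `name`, or raise like the traceback does."""
    tokenizer_class = registry.get(name)
    if tokenizer_class is None:
        # Same message format as the error in the report above.
        raise ValueError(
            "Tokenizer class {} does not exist or is not currently imported.".format(name)
        )
    return tokenizer_class


try:
    resolve_tokenizer_class("T5Tokenizer", registry)
except ValueError as err:
    # The message names the class but not the missing package.
    print(err)
```

The point of the sketch is that the lookup cannot tell "unknown class" apart from "known class with a missing dependency", which is why the message does not mention sentencepiece.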

4 reactions
johnpaulbin commented, Dec 14, 2021

(Quoting gingsi's comment above about running pip install sentencepiece.)

The same fix works with DeBERTa models as well. It seems the error message just isn't very descriptive.
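Since the error does not name the missing package, one option is to check for it before calling AutoTokenizer.from_pretrained. The following is a minimal sketch using only the standard library; `missing_backends` and `install_hint` are hypothetical helper names, and the default package list assumes sentencepiece is the only dependency in question, as in the comments above.

```python
import importlib.util


def missing_backends(packages=("sentencepiece",)):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


def install_hint(packages=("sentencepiece",)):
    """Return a pip command for the missing packages, or None if all are present."""
    missing = missing_backends(packages)
    if not missing:
        return None
    return "pip install " + " ".join(missing)


hint = install_hint()
if hint is not None:
    print("Missing tokenizer dependency, try:", hint)
else:
    print("sentencepiece is available")
```

On Colab, the runtime may need to be restarted after installing the package so that an already-imported transformers picks it up.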
