Cannot instantiate Tokenizer
I am using Hugging Face Transformers 4.0.0. When I instantiate the AutoTokenizer for indic-bert, I get the following error:
tokenizer = AutoTokenizer.from_pretrained('ai4bharat/indic-bert')
"Couldn't instantiate the backend tokenizer from one of: (1) a `tokenizers` library serialization file, (2) a slow tokenizer instance to convert or (3) an equivalent slow tokenizer class to instantiate and convert. You need to have sentencepiece installed to convert a slow tokenizer to a fast one."
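The error points at a missing `sentencepiece` dependency. A minimal sketch of the usual fix, checking which packages are actually importable before loading the tokenizer (the helper name `missing_packages` is mine, not part of any library):

```python
import importlib.util

def missing_packages(*names):
    """Return the names in `names` that this interpreter cannot import."""
    return [n for n in names if importlib.util.find_spec(n) is None]

missing = missing_packages("transformers", "sentencepiece")
if missing:
    # Install the missing packages, then restart the Python process/kernel:
    # transformers detects sentencepiece at import time, so installing it
    # mid-session (e.g. in a notebook) often does not take effect until restart.
    print("pip install " + " ".join(missing))
else:
    from transformers import AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained("ai4bharat/indic-bert")
```

After installing, restarting the interpreter before re-importing `transformers` is the step people most commonly skip.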
- Created 2 years ago
- Comments: 12 (4 by maintainers)
Top GitHub Comments
Hi, I'm also having this problem. Trying to instantiate
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-nl", use_fast=False)
but I get: "ValueError: This tokenizer cannot be instantiated. Please make sure you have sentencepiece installed in order to use this tokenizer."
But I have already installed sentencepiece. My versions:
- pip: 20.3.3
- sentencepiece: 0.1.94
- transformers: 4.1.1
The same code snippet with "Musixmatch/umberto-wikipedia-uncased-v1" also doesn't work for me.
Does anyone have more ideas?
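One cause worth ruling out when sentencepiece is "installed" but still not found: the `pip` on your PATH may belong to a different Python than the one running your script, so the install lands in another environment. A sketch of building an install command bound to the running interpreter (`pip_install_command` is a hypothetical helper name):

```python
import sys

def pip_install_command(package):
    """Build a pip install command tied to the interpreter running this code,
    avoiding the PATH mismatch where `pip` targets another Python."""
    return [sys.executable, "-m", "pip", "install", package]

# Run the printed command instead of a bare `pip install sentencepiece`:
print(" ".join(pip_install_command("sentencepiece")))
```

Equivalently, `python -m pip install sentencepiece` from the same `python` you use to run the script.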