
KeyError: 'layoutlmv2' in AutoTokenizer

See original GitHub issue

I get a KeyError when I try to run AutoTokenizer.from_pretrained("microsoft/layoutlmv2-base-uncased"). The same happens even when I download the files locally and point the call at the config folder.

I also cannot find layoutlmv2 in the AutoTokenizer.from_pretrained documentation.

Any leads on how to use it correctly would be helpful.
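For context on why this error occurs: AutoTokenizer resolves a checkpoint by reading the model_type field from its config and looking that string up in an internal mapping of model types to tokenizer classes. A KeyError: 'layoutlmv2' means the installed version has no entry for that model type, so neither the Hub name nor a local copy of the files will help. Below is a minimal sketch of this registry pattern; all names here are hypothetical illustrations, not the real transformers internals:

```python
# Hypothetical registry mapping a config's model_type to a tokenizer class name.
TOKENIZER_REGISTRY = {
    "bert": "BertTokenizer",
    "layoutlm": "LayoutLMTokenizer",
    # note: no "layoutlmv2" entry yet
}

def tokenizer_for(model_type: str) -> str:
    # A plain dict lookup: raises KeyError('layoutlmv2') when the model type
    # is not registered, which is the error reported in this issue.
    return TOKENIZER_REGISTRY[model_type]

# Importing a companion package (as suggested in the answers below) can fix
# this by adding the missing entry at import time:
TOKENIZER_REGISTRY["layoutlmv2"] = "LayoutLMv2Tokenizer"
```

After the registration step, the same lookup that previously raised KeyError succeeds, which matches the behavior described in the accepted workaround.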

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 12

Top GitHub Comments

1 reaction
nkrot commented, Jun 14, 2021

I also do not find layoutlmv2 in the AutoTokenizer.from_pretrained Documentation

This is because it is not in transformers but in layoutlmft: the latter adds the new symbols in layoutlmft/layoutlmft/__init__.py.

The snippet below worked for me. Well, at least it did not produce an error; I did not try to run the tokenizer itself 😃

import layoutlmft  # importing this registers the LayoutLMv2 classes with transformers
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('microsoft/layoutlmv2-base-uncased')

I have to say, the library's documentation is quite poor. I am learning it by trial and error and by reading the code; the script run_funsd.py and grep are very handy 😃

Good luck in your exploration!

0 reactions
sindhuattaiger commented, Aug 6, 2021

(quoting nkrot's answer above)

This solved the issue


Top Results From Across the Web

  • KeyError when using non-default models in Huggingface ... — It throws a KeyError. nlp = pipeline('sentiment-analysis', tokenizer = AutoTokenizer.from_pretrained("DeepPavlov/bert-base-cased- ...
  • KeyError when using AutoTokenizer for facebook/detr-resnet — Environment info: transformers version: 4.16.0, Platform: Ubuntu 20.04, Python version: 3.8.12, PyTorch version (GPU?)
  • LayoutLMV2 - Hugging Face — In this paper, we present LayoutLMv2 by pre-training text, layout and image in a multi-modal framework, where new model architectures and pre-training tasks...
  • [LayoutLMv2] TokenClassifier on CORD - Kaggle — In this notebook, we are going to fine-tune LayoutLMv2 for token classification on the CORD dataset. The goal for the model is to label...
  • Hugging Face LayoutLMv2 Model True Inference - YouTube — I explain why OCR quality matters for Hugging Face LayoutLMv2 model performance, related to document data classification.
