
KeyError: 'layoutlmv2' in AutoTokenizer

See original GitHub issue

I get a KeyError when I try to run AutoTokenizer.from_pretrained("microsoft/layoutlmv2-base-uncased"). The same happens even when I download the files locally and point the call at the config folder.

I also cannot find layoutlmv2 in the AutoTokenizer.from_pretrained documentation.

Any leads on how to use it correctly would be helpful.
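For context on why this error occurs: AutoTokenizer resolves a checkpoint by reading the model_type field from its config and looking that string up in an internal mapping of model types to tokenizer classes. A KeyError: 'layoutlmv2' means the installed version has no entry for that model type, so neither the Hub name nor a local copy of the files will help. Below is a minimal sketch of this registry pattern; all names here are hypothetical illustrations, not the real transformers internals:

```python
# Hypothetical registry mapping a config's model_type to a tokenizer class name.
TOKENIZER_REGISTRY = {
    "bert": "BertTokenizer",
    "layoutlm": "LayoutLMTokenizer",
    # note: no "layoutlmv2" entry yet
}

def tokenizer_for(model_type: str) -> str:
    # A plain dict lookup: raises KeyError('layoutlmv2') when the model type
    # is not registered, which is the error reported in this issue.
    return TOKENIZER_REGISTRY[model_type]

# Importing a companion package (as suggested in the answers below) can fix
# this by adding the missing entry at import time:
TOKENIZER_REGISTRY["layoutlmv2"] = "LayoutLMv2Tokenizer"
```

After the registration step, the same lookup that previously raised KeyError succeeds, which matches the behavior described in the accepted workaround.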

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 12

Top GitHub Comments

1 reaction
nkrot commented, Jun 14, 2021

I also do not find layoutlmv2 in the AutoTokenizer.from_pretrained Documentation

This is because it is not in transformers but in layoutlmft: the latter adds the new symbols in layoutlmft/layoutlmft/__init__.py.

The snippet below worked for me. Well, at least it did not produce an error; I did not try to run the tokenizer itself 😃

import layoutlmft  # importing this registers the LayoutLMv2 classes with transformers
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('microsoft/layoutlmv2-base-uncased')

I have to say, the library's documentation is quite poor. I am learning it by trial and error and by reading the code; the script run_funsd.py and grep are very handy 😃

Good luck in your exploration!

0 reactions
sindhuattaiger commented, Aug 6, 2021

(quoting nkrot's answer above)

This solved the issue


Top Results From Across the Web

  • KeyError when using non-default models in Huggingface ... — It throws a KeyError. nlp = pipeline('sentiment-analysis', tokenizer = AutoTokenizer.from_pretrained("DeepPavlov/bert-base-cased- ...
  • KeyError when using AutoTokenizer for facebook/detr-resnet — Environment info: transformers version: 4.16.0, Platform: Ubuntu 20.04, Python version: 3.8.12, PyTorch version (GPU?)
  • LayoutLMV2 - Hugging Face — In this paper, we present LayoutLMv2 by pre-training text, layout and image in a multi-modal framework, where new model architectures and pre-training tasks...
  • [LayoutLMv2] TokenClassifier on CORD - Kaggle — In this notebook, we are going to fine-tune LayoutLMv2 for token classification on the CORD dataset. The goal for the model is to label...
  • Hugging Face LayoutLMv2 Model True Inference - YouTube — I explain why OCR quality matters for Hugging Face LayoutLMv2 model performance, related to document data classification.
