LayoutLMv2 is added to HuggingFace Transformers
Hi,
I’ve added LayoutLMv2 and LayoutXLM to HuggingFace Transformers. I’ve also created several notebooks to fine-tune the model on custom data, as well as to use it for inference. Demo notebooks can be found here. I’ve split them up according to the different datasets: FUNSD, CORD, DocVQA and RVL-CDIP.
For now, you’ve got to install Transformers from master to use it:
```
pip install git+https://github.com/huggingface/transformers.git
```
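Once installed, each dataset corresponds to a different task head in the library. Here's a minimal sketch of loading them; the `num_labels` values below are just illustrative, adjust them to your dataset's label set:

```python
from transformers import (
    LayoutLMv2ForTokenClassification,     # e.g. FUNSD/CORD (token labeling)
    LayoutLMv2ForSequenceClassification,  # e.g. RVL-CDIP (document classification)
    LayoutLMv2ForQuestionAnswering,       # e.g. DocVQA (extractive QA)
)

checkpoint = "microsoft/layoutlmv2-base-uncased"

# num_labels values are placeholders; use the label set of your own dataset
token_model = LayoutLMv2ForTokenClassification.from_pretrained(checkpoint, num_labels=7)
doc_model = LayoutLMv2ForSequenceClassification.from_pretrained(checkpoint, num_labels=16)
qa_model = LayoutLMv2ForQuestionAnswering.from_pretrained(checkpoint)
```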
The big difference with LayoutLM (v1) is that I've now also created a processor called `LayoutLMv2Processor`. It takes care of all the preprocessing required for the model (i.e. you just give it an image and it returns `input_ids`, `attention_mask`, `token_type_ids`, `bbox` and `image`). It uses Tesseract under the hood for OCR. You can also optionally provide your own words and boxes, if you prefer to use your own OCR. All documentation can be found here: https://huggingface.co/transformers/master/model_doc/layoutlmv2.html
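Default usage (with Tesseract applied under the hood) looks roughly like this; the image path is a placeholder:

```python
from PIL import Image
from transformers import LayoutLMv2Processor

processor = LayoutLMv2Processor.from_pretrained("microsoft/layoutlmv2-base-uncased")

# "document.png" is a placeholder path to a scanned document image
image = Image.open("document.png").convert("RGB")
encoding = processor(image, return_tensors="pt")
print(encoding.keys())  # input_ids, token_type_ids, attention_mask, bbox, image

# To plug in your own OCR instead, disable Tesseract on the feature extractor
# and pass the words and (0-1000 normalized) boxes yourself:
from transformers import LayoutLMv2FeatureExtractor, LayoutLMv2TokenizerFast

feature_extractor = LayoutLMv2FeatureExtractor(apply_ocr=False)
tokenizer = LayoutLMv2TokenizerFast.from_pretrained("microsoft/layoutlmv2-base-uncased")
processor = LayoutLMv2Processor(feature_extractor, tokenizer)

words = ["hello", "world"]                         # example OCR output
boxes = [[48, 84, 156, 112], [160, 84, 260, 112]]  # one box per word
encoding = processor(image, words, boxes=boxes, return_tensors="pt")
```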
Perhaps relevant to the following issues: #333 #335 #351 #329 #356
Top GitHub Comments
Hello Niels, amazing work! Out of curiosity, will you be adding LayoutReader to the HF ecosystem as well? If not, I'll try to do it eventually, but I can't guarantee I'll have the time anytime soon.
Hello @lalitr994, in which part of the code did you manage to get the confidence scores? Your help is appreciated.