the correspondence about tags or labels

Hi,

I am quite confused about the tags in the training files and in the resulting TEI files from the prediction phase. The tags used for annotation do not seem to be the same as the tagging labels. There must be a mapping between them, but I cannot find it. For example, the tagginglabels file contains <section> and <paragraph>, but in the result files only <head> and <p> can be found. Where does the transformation take place? I’ve checked the saxparser and parser files, but I am still confused.

Thanks in advance.

Top GitHub Comments

kermitt2 commented, Dec 29, 2018

Hello !

Training data are in TEI XML (a flat TEI that preserves the stream order of the PDF). The mapping between the TEI XML tags and the labels used by the models is in the SAX parsers corresponding to each model in grobid-trainer.

The final results are then serialized into a more complex TEI XML (normalised order and more deeply structured). So there is also a mapping from the labels used by the models into this TEI, which is mainly done in the file TEIFormatter and, for substructures like date, person, etc., directly by the POJO classes under org.grobid.core.data.
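To make the two mapping stages concrete, here is a minimal Python sketch. It is an illustration only and not GROBID's code, which is written in Java. The dictionary names and the `serialize` helper are hypothetical. The label-to-output pairs (`<section>` to `<head>`, `<paragraph>` to `<p>`) come from the question above, while the training-side entries are an assumption made for the example.

```python
from xml.sax.saxutils import escape

# Stage 1 (assumed entries): tag in the flat training TEI -> label used by the model.
# In GROBID this direction is handled by the SAX parsers in grobid-trainer.
TRAINING_TAG_TO_LABEL = {
    "head": "<section>",
    "p": "<paragraph>",
}

# Stage 2: model label -> tag in the final TEI output.
# In GROBID this direction is mainly handled by TEIFormatter.
LABEL_TO_OUTPUT_TAG = {
    "<section>": "head",
    "<paragraph>": "p",
}


def serialize(labeled_segments):
    """Turn (label, text) pairs into a flat run of output elements."""
    parts = []
    for label, text in labeled_segments:
        tag = LABEL_TO_OUTPUT_TAG[label]
        parts.append(f"<{tag}>{escape(text)}</{tag}>")
    return "".join(parts)


segments = [
    ("<section>", "1 Introduction"),
    ("<paragraph>", "Tags and labels differ."),
]
print(serialize(segments))
```

The real output TEI is more deeply structured than this flat run of elements, so the sketch only shows where the label names disappear and the output tag names appear.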

Hope it makes things clearer! If not, don’t hesitate to ask more.

kermitt2 commented, Jan 23, 2019

Hi again @Punchwes! I am closing this issue, which was about the XML tag/CRF label correspondence.

I am opening a separate one about the level and number of section headers, to keep track of improvements on this aspect.
