the correspondence about tags or labels
Hi,
I am quite confused about the tags in the training files and in the resulting TEI files from the prediction phase. The tags used for annotation do not seem to be the same as the tagging labels. There is a mapping between them, but I cannot find it. For example, the tagginglabels file contains <section> and <paragraph> tags, but only <head> and <p> can be found in the result files. Where does the transformation take place? I've checked the saxparser and parser files, but I am still quite confused.
Thanks in advance.
Issue Analytics
- State:
- Created 5 years ago
- Comments: 5 (3 by maintainers)
Top GitHub Comments
Hello!
The training data are in TEI XML (a flat TEI that preserves the stream order of the PDF). The mapping between the TEI XML tags and the labels used by the models is in the SAX parsers corresponding to the model in grobid-trainer.
The final results are then serialized into a more complex TEI XML (normalised order and more deeply structured). So there is also a mapping from the labels used by the models to this TEI, which is mainly done in the file TEIFormatter and, for substructures like date, person, etc., directly by the POJO classes under org.grobid.core.data.
Hope it makes things clearer! If not, don't hesitate to ask more.
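To make the two mappings concrete, here is a minimal sketch in Python (GROBID itself is Java, so this is not its code). It assumes, for illustration only, that the training tags <head> and <p> map to the labels <section> and <paragraph>, and that those labels map back to <head> and <p> in the output, as described in the question. The names TRAINING_TAG_TO_LABEL, LABEL_TO_OUTPUT_TAG, TrainingTeiHandler, training_xml_to_labels and labels_to_output_tei are hypothetical and do not exist in GROBID.

```python
import xml.sax
from itertools import groupby
from xml.sax.saxutils import escape

# Stage 1 (training side): flat TEI training tag -> model label.
# Illustrative values only; in GROBID this lives in the per-model SAX parsers.
TRAINING_TAG_TO_LABEL = {"head": "<section>", "p": "<paragraph>"}

# Stage 2 (prediction side): model label -> element of the final TEI.
# Illustrative values only; in GROBID this is mainly done in TEIFormatter.
LABEL_TO_OUTPUT_TAG = {"<section>": "head", "<paragraph>": "p"}


class TrainingTeiHandler(xml.sax.ContentHandler):
    """Collects (token, label) pairs from a flat TEI training fragment."""

    def __init__(self):
        super().__init__()
        self.current_label = None
        self.buffer = []
        self.labeled_tokens = []

    def startElement(self, name, attrs):
        if name in TRAINING_TAG_TO_LABEL:
            self.current_label = TRAINING_TAG_TO_LABEL[name]
            self.buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so buffer it until the tag closes.
        if self.current_label is not None:
            self.buffer.append(content)

    def endElement(self, name):
        if name in TRAINING_TAG_TO_LABEL and self.current_label is not None:
            for token in "".join(self.buffer).split():
                self.labeled_tokens.append((token, self.current_label))
            self.current_label = None


def training_xml_to_labels(xml_text):
    """Stage 1: parse training XML and return labeled tokens."""
    handler = TrainingTeiHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.labeled_tokens


def labels_to_output_tei(labeled_tokens):
    """Stage 2: group consecutive tokens by label and wrap them in output tags."""
    parts = []
    for label, group in groupby(labeled_tokens, key=lambda pair: pair[1]):
        tag = LABEL_TO_OUTPUT_TAG[label]
        text = " ".join(token for token, _ in group)
        parts.append(f"<{tag}>{escape(text)}</{tag}>")
    return "".join(parts)


if __name__ == "__main__":
    sample = "<text><head>1 Introduction</head><p>Hello world</p></text>"
    pairs = training_xml_to_labels(sample)
    print(pairs)
    print(labels_to_output_tei(pairs))
```

The sketch leaves out everything else the real pipeline does (feature generation, the sequence labelling model itself, reordering and nesting of the final TEI). It only shows why the label names seen by the model can differ from the tag names in both the training files and the result files.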
Hi again @Punchwes! I am closing this issue, which was about the XML tag/CRF label correspondence.
I am opening a separate one about the level and number of section headers, to keep track of improvements on this aspect.