GROBID makes mistakes when a reference contains square brackets
Paper:
Reference:
TEI Result:
...
<biblStruct coords="13,326.72,602.79,229.86,5.73;13,326.72,610.79,230.88,5.73" xml:id="b160">
<monogr>
<title level="m" type="main">Click" reactions for the N-terminal and side-chain functionalization of peptides with</title>
<author>
<persName coords=""><forename type="first">H</forename><surname>Pfeiffer</surname></persName>
</author>
<author>
<persName coords=""><forename type="first">A</forename><surname>Rojas</surname></persName>
</author>
<author>
<persName coords=""><forename type="first">J</forename><surname>Niesel</surname></persName>
</author>
<author>
<persName coords=""><forename type="first">U</forename><surname>Schatzschneider</surname></persName>
</author>
<author>
<persName coords=""><surname>Sonogashira</surname></persName>
</author>
<imprint>
<pubPlace>Mn(CO)</pubPlace>
</imprint>
</monogr>
<note type="raw_reference">H. Pfeiffer, A. Rojas, J. Niesel, U. Schatzschneider, Sonogashira and "Click" reac- tions for the N-terminal and side-chain functionalization of peptides with [Mn(CO)</note>
</biblStruct>
<biblStruct coords="13,350.04,618.72,207.57,5.73;13,326.72,626.72,74.63,5.73" xml:id="b161">
<monogr>
<title level="m" type="main">+-based CO releasing molecules (tpm = tris(pyrazolyl)methane)</title>
<imprint>
<date type="published" when="2009" />
<biblScope unit="page" from="4292" to="4298" />
<pubPlace>Dalton Trans</pubPlace>
</imprint>
</monogr>
<note type="raw_reference">+-based CO releasing molecules (tpm = tris(pyrazolyl)methane), Dalton Trans. (2009) 4292-4298.</note>
</biblStruct>
...
Description:
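GROBID seems to segment a reference incorrectly when it contains square brackets. In the TEI result above, a single reference is split into two entries (b160 and b161) right after the opening bracket in "[Mn(CO)", and "Sonogashira" from the title is parsed as an author surname.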
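A quick way to spot candidates for this kind of split in GROBID output is to look for raw references whose square brackets do not balance. The sketch below is only a heuristic under stated assumptions: the function name `unbalanced_references` is made up for illustration, the sample is a trimmed copy of the two entries above, and the check only catches the half of a split that keeps the stray bracket.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined xml: prefix to this namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def local_name(tag: str) -> str:
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def unbalanced_references(tei_xml: str) -> list[str]:
    """Return the xml:id of each biblStruct whose raw reference
    has a different number of '[' and ']' characters.

    Hypothetical helper for illustration, not part of GROBID.
    """
    root = ET.fromstring(tei_xml)
    flagged = []
    for bibl in root.iter():
        if local_name(bibl.tag) != "biblStruct":
            continue
        for note in bibl.iter():
            if local_name(note.tag) == "note" and note.get("type") == "raw_reference":
                text = "".join(note.itertext())
                if text.count("[") != text.count("]"):
                    flagged.append(bibl.get(XML_ID, "?"))
    return flagged


# Trimmed copy of the two entries from the TEI result above.
sample = """<listBibl>
<biblStruct xml:id="b160"><note type="raw_reference">H. Pfeiffer, A. Rojas, J. Niesel, U. Schatzschneider, Sonogashira and "Click" reac- tions for the N-terminal and side-chain functionalization of peptides with [Mn(CO)</note></biblStruct>
<biblStruct xml:id="b161"><note type="raw_reference">+-based CO releasing molecules (tpm = tris(pyrazolyl)methane), Dalton Trans. (2009) 4292-4298.</note></biblStruct>
</listBibl>"""

print(unbalanced_references(sample))  # ['b160']
```

Only b160 is flagged, because the closing bracket is missing from the extracted text of b161 as well. A flagged entry is a hint that it and its neighbor may belong together, not proof.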
Issue Analytics
- State:
- Created 2 years ago
- Comments: 9 (9 by maintainers)
Top GitHub Comments
Hi @elonzh!
Unfortunately, the two articles given as examples cannot be used as training data: the first one is CC BY but non-derivative, and the second one is closed access.
If you happen to find similar errors in one or two CC-BY articles, don’t hesitate to reference them here 😃
Another error case:
Paper:
[math/0506081] The Dantzig selector: Statistical estimation when $p$ is much larger than $n$
Reference:
I detect reference errors by clustering alignments. The algorithm is context-free and works well when the page is well formatted.
Maybe we could integrate the algorithm into GROBID?
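The comment does not describe the algorithm, so the following is only one possible reading of "clustering alignments", not the commenter's implementation: group the left x position of each reference's first line and flag entries that start outside the dominant group. It assumes the coords attribute has the form "page,x,y,width,height" with boxes separated by ";". The names `first_line_x`, `cluster_1d` and `misaligned` are hypothetical, and the b159 and b162 coordinates are invented for the example; b160 and b161 are taken from the TEI result above.

```python
def first_line_x(coords: str) -> float:
    """x of the first box in a coords string, assumed 'page,x,y,w,h;...'."""
    first_box = coords.split(";")[0]
    return float(first_box.split(",")[1])


def cluster_1d(values: list[float], tolerance: float) -> list[list[float]]:
    """Group sorted values; a gap larger than `tolerance` starts a new group."""
    clusters: list[list[float]] = []
    for v in sorted(values):
        if clusters and v - clusters[-1][-1] <= tolerance:
            clusters[-1].append(v)
        else:
            clusters.append([v])
    return clusters


def misaligned(entries: dict[str, str], tolerance: float = 2.0) -> list[str]:
    """Return ids whose first line starts outside the largest x cluster."""
    xs = {ref_id: first_line_x(c) for ref_id, c in entries.items()}
    dominant = max(cluster_1d(list(xs.values()), tolerance), key=len)
    lo, hi = dominant[0], dominant[-1]
    return [ref_id for ref_id, x in xs.items() if not lo <= x <= hi]


entries = {
    "b159": "13,326.72,586.79,229.86,5.73",  # invented for the example
    "b160": "13,326.72,602.79,229.86,5.73;13,326.72,610.79,230.88,5.73",
    "b161": "13,350.04,618.72,207.57,5.73;13,326.72,626.72,74.63,5.73",
    "b162": "13,326.72,634.72,229.86,5.73",  # invented for the example
}

print(misaligned(entries))  # ['b161']
```

Here b161 stands out because its first box starts at x = 350.04 while the others start at 326.72. A real page with several columns or hanging indents would need the clustering done per column, which is probably why the approach depends on a well-formatted page.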