Allow namespace prefixes other than 'None' in PageXML
Eynollah, e.g., produces PageXML files that use an explicit prefix (xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"). calamari_ocr/ocr/dataset/datareader/pagexml/reader.py, however, expects the prefix to be ‘None’ and throws an error when processing an Eynollah PageXML file.
When I change line 120 of reader.py
from
ns = {"ns": root.nsmap[None]}
to
ns = {'ns' : 'http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15'}
it works. I’m not sure whether the namespace dictionary can be generalized to cover both output styles. Maybe XPath’s local-name function (instead of lxml’s find or findall) is an alternative.
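One possible way to cover both output styles is to read the namespace URI from the root element's tag instead of from the prefix map, since the parsed tag has the form {uri}PcGts regardless of which prefix the file declares. The sketch below only illustrates that idea and is not necessarily what the merged fix does. It uses the standard library's xml.etree.ElementTree so that it runs without extra dependencies (lxml elements expose their tag in the same {uri}name form); namespace_of, text_region_ids and the two sample documents are made up for the example.

```python
import xml.etree.ElementTree as ET

PAGE_NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"

# Two hypothetical minimal documents: default namespace vs. explicit "pc" prefix.
DEFAULT_STYLE = (
    f'<PcGts xmlns="{PAGE_NS}"><Page><TextRegion id="r1"/></Page></PcGts>'
)
PREFIXED_STYLE = (
    f'<pc:PcGts xmlns:pc="{PAGE_NS}">'
    '<pc:Page><pc:TextRegion id="r1"/></pc:Page></pc:PcGts>'
)


def namespace_of(element):
    """Return the namespace URI from a '{uri}name' tag, or '' if there is none."""
    tag = element.tag
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return ""


def text_region_ids(xml_text):
    """Collect TextRegion ids, whatever prefix the document uses."""
    root = ET.fromstring(xml_text)
    # Build the lookup dictionary from the root tag, not from a fixed prefix.
    ns = {"ns": namespace_of(root)}
    return [region.get("id") for region in root.findall(".//ns:TextRegion", ns)]


print(text_region_ids(DEFAULT_STYLE))
print(text_region_ids(PREFIXED_STYLE))
```

Both calls go through the same lookup code, so the reader no longer depends on the file declaring a default (None) prefix.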
Issue Analytics
- Created 2 years ago
- Comments: 17 (17 by maintainers)
Top GitHub Comments
Okay, just wanted to “clarify” the most obvious mistake. Luckily, I receive the same error using your files. I will examine this and hopefully find the reason!
Thanks for testing, @alexander-winkler. I will close this and merge #259.