ERROR! XML parser error DESCRIPTION.en_us.html
Observed behaviour
Here’s a report from the (current) Font Bakery Dashboard version of Font Bakery.
https://fontbakery.graphicore.de/report/de0aff61-4456-448f-b51f-b17c7e03b23f
ERROR com.google.fonts/check/description/broken_links Does DESCRIPTION file contain broken links?
ERROR
Failed with XMLSyntaxError: AttValue: " or ' expected, line 11, column 61 (<string>, line 11)
traceback
File "/usr/local/lib/python3.7/dist-packages/fontbakery/checkrunner.py", line 344, in _exec_check
for sub_result in result: # Might raise.
File "/usr/local/lib/python3.7/dist-packages/fontbakery/profiles/googlefonts.py", line 245, in com_google_fonts_check_description_broken_links
doc = etree.fromstring("<html>" + description + "</html>")
File "src/lxml/etree.pyx", line 3234, in lxml.etree.fromstring
File "src/lxml/parser.pxi", line 1876, in lxml.etree._parseMemoryDocument
File "src/lxml/parser.pxi", line 1757, in lxml.etree._parseDoc
File "src/lxml/parser.pxi", line 1068, in lxml.etree._BaseParser._parseUnicodeDoc
File "src/lxml/parser.pxi", line 601, in lxml.etree._ParserContext._handleParseResultDoc
File "src/lxml/parser.pxi", line 711, in lxml.etree._handleParseResult
File "src/lxml/parser.pxi", line 640, in lxml.etree._raiseParseError
The same error occurs in a different check:
com.google.fonts/check/description/git_url Does DESCRIPTION file contain a upstream Git repo URL?
ERROR
Failed with XMLSyntaxError: AttValue: " or ' expected, line 11, column 61 (<string>, line 11)
traceback
File "/usr/local/lib/python3.7/dist-packages/fontbakery/checkrunner.py", line 344, in _exec_check
for sub_result in result: # Might raise.
File "/usr/local/lib/python3.7/dist-packages/fontbakery/profiles/googlefonts.py", line 301, in com_google_fonts_check_description_git_url
doc = etree.fromstring("<html>" + description + "</html>")
File "src/lxml/etree.pyx", line 3234, in lxml.etree.fromstring
File "src/lxml/parser.pxi", line 1876, in lxml.etree._parseMemoryDocument
File "src/lxml/parser.pxi", line 1757, in lxml.etree._parseDoc
File "src/lxml/parser.pxi", line 1068, in lxml.etree._BaseParser._parseUnicodeDoc
File "src/lxml/parser.pxi", line 601, in lxml.etree._ParserContext._handleParseResultDoc
File "src/lxml/parser.pxi", line 711, in lxml.etree._handleParseResult
File "src/lxml/parser.pxi", line 640, in lxml.etree._raiseParseError
The report comes from this process/PR; the files used with Font Bakery are here: DM Serif Display
Looking into DESCRIPTION.en_us.html, we find a badly formatted attribute at line 11, position 61, where the attribute value is in the wrong kind of quotes:
The DM Serif project was commissoned by Google from
<a href=“https://colophon-foundry.org“>
Colophon …
That is “ (Left Double Quotation Mark, uni201C) instead of " (Quotation Mark, uni0022) or ' (Apostrophe, uni0027).
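To make the failure easier to reproduce, here is a minimal sketch that wraps a sample snippet the same way the checks do and then locates typographic quotes inside tags. It assumes the standard-library `xml.etree.ElementTree` as a stand-in for `lxml` (both reject this input, though the error messages differ). The sample string and the helper name `find_typographic_quotes_in_tags` are hypothetical and not part of Font Bakery.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample mirroring the snippet quoted above; the href value
# is wrapped in U+201C instead of a plain quotation mark.
description = (
    "<p>\n"
    "The DM Serif project was commissoned by Google from\n"
    "<a href=\u201chttps://colophon-foundry.org\u201c>Colophon</a>\n"
    "</p>\n"
)


def find_typographic_quotes_in_tags(text):
    """Return (line, column, char) for each curly quote found inside a tag."""
    hits = []
    for tag in re.finditer(r"<[^<>]*>", text):
        for m in re.finditer("[\u2018\u2019\u201c\u201d]", tag.group()):
            offset = tag.start() + m.start()
            line = text.count("\n", 0, offset) + 1
            column = offset - (text.rfind("\n", 0, offset) + 1) + 1
            hits.append((line, column, m.group()))
    return hits


# Wrap the snippet in <html> like the failing checks do, then try to parse.
try:
    ET.fromstring("<html>" + description + "</html>")
    parsed = True
except ET.ParseError:
    parsed = False

hits = find_typographic_quotes_in_tags(description)
```

Line and column numbers are 1-based and refer to the unwrapped snippet, so they can differ from the position the parser reports for the wrapped string.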
Expected behaviour
Maybe we need a new check, "DESCRIPTION is well formed", to detect parser errors, and/or all checks that parse this file directly should be aware of possible parser errors and FAIL accordingly.
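A rough sketch of the second option, where a check that parses the file catches the parser error and reports a FAIL instead of crashing with an ERROR. It uses the standard-library `xml.etree.ElementTree` as a stand-in for `lxml`. The function names `parse_description` and `check_description_links` and the `(status, message)` tuples are assumptions for illustration only; they are not Font Bakery's actual API.

```python
import xml.etree.ElementTree as ET


def parse_description(description):
    """Parse a DESCRIPTION snippet; return (document, error_message)."""
    try:
        return ET.fromstring("<html>" + description + "</html>"), None
    except ET.ParseError as err:
        return None, str(err)


def check_description_links(description):
    """Hypothetical check: yields (status, message) pairs."""
    doc, error = parse_description(description)
    if doc is None:
        # Report the malformed file as a check result, not an exception.
        yield "FAIL", f"DESCRIPTION file is not well formed: {error}"
        return
    links = [a.get("href") for a in doc.iter("a") if a.get("href")]
    yield "PASS", f"Parsed DESCRIPTION file and found {len(links)} link(s)."
```

Sharing one parsing helper means every check that reads the file handles a malformed DESCRIPTION the same way.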
Resources and exact process needed to replicate
I’m not sure why this didn’t surface earlier.
Issue Analytics
- Created 4 years ago
- Comments:7 (4 by maintainers)
Top GitHub Comments
Thanks, @felipesanches. It makes sense.
Since we already had com.google.fonts/check/description/valid_html, I decided to extend it to ensure the file parses correctly and also is only a snippet (not including the main <html> tag). I also updated its code-test routines and rationale description to reflect the additional requirements.
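The two requirements described in this comment could look roughly like the following. This is only a sketch under assumptions, not the code that was actually committed: the function name `check_description_is_valid_snippet`, the regular expression, and the returned `(status, message)` tuple are all hypothetical, and the standard-library parser stands in for `lxml`.

```python
import re
import xml.etree.ElementTree as ET


def check_description_is_valid_snippet(description):
    """Hypothetical check: the file must be a well-formed snippet without <html>."""
    # Requirement 1: the file is a snippet, so it must not carry its own <html> tag.
    if re.search(r"<\s*/?\s*html\b", description, re.IGNORECASE):
        return "FAIL", "DESCRIPTION file should be a snippet without an <html> tag."
    # Requirement 2: once wrapped in a root element, the snippet must parse.
    try:
        ET.fromstring("<html>" + description + "</html>")
    except ET.ParseError as err:
        return "FAIL", f"DESCRIPTION file does not parse: {err}"
    return "PASS", "DESCRIPTION file is a well-formed snippet."
```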