ERROR! XML parser error DESCRIPTION.en_us.html

Observed behaviour

Here’s a report from the (current) Font Bakery Dashboard version of Font Bakery.

https://fontbakery.graphicore.de/report/de0aff61-4456-448f-b51f-b17c7e03b23f

 ERROR com.google.fonts/check/description/broken_links Does DESCRIPTION file contain broken links? 

ERROR
Failed with XMLSyntaxError: AttValue: " or ' expected, line 11, column 61 (<string>, line 11)
traceback

  File "/usr/local/lib/python3.7/dist-packages/fontbakery/checkrunner.py", line 344, in _exec_check
    for sub_result in result:  # Might raise.
  File "/usr/local/lib/python3.7/dist-packages/fontbakery/profiles/googlefonts.py", line 245, in com_google_fonts_check_description_broken_links
    doc = etree.fromstring("<html>" + description + "</html>")
  File "src/lxml/etree.pyx", line 3234, in lxml.etree.fromstring
  File "src/lxml/parser.pxi", line 1876, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1757, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1068, in lxml.etree._BaseParser._parseUnicodeDoc
  File "src/lxml/parser.pxi", line 601, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 711, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 640, in lxml.etree._raiseParseError

The same error occurs in a different check:

com.google.fonts/check/description/git_url Does DESCRIPTION file contain a upstream Git repo URL?

    ERROR
    Failed with XMLSyntaxError: AttValue: " or ' expected, line 11, column 61 (<string>, line 11)
    traceback

      File "/usr/local/lib/python3.7/dist-packages/fontbakery/checkrunner.py", line 344, in _exec_check
        for sub_result in result:  # Might raise.
      File "/usr/local/lib/python3.7/dist-packages/fontbakery/profiles/googlefonts.py", line 301, in com_google_fonts_check_description_git_url
        doc = etree.fromstring("<html>" + description + "</html>")
      File "src/lxml/etree.pyx", line 3234, in lxml.etree.fromstring
      File "src/lxml/parser.pxi", line 1876, in lxml.etree._parseMemoryDocument
      File "src/lxml/parser.pxi", line 1757, in lxml.etree._parseDoc
      File "src/lxml/parser.pxi", line 1068, in lxml.etree._BaseParser._parseUnicodeDoc
      File "src/lxml/parser.pxi", line 601, in lxml.etree._ParserContext._handleParseResultDoc
      File "src/lxml/parser.pxi", line 711, in lxml.etree._handleParseResult
      File "src/lxml/parser.pxi", line 640, in lxml.etree._raiseParseError

The report is from this process/PR; the files used with Font Bakery are here: DM Serif Display

Looking into DESCRIPTION.en_us.html, we do find a badly formatted attribute at line 11, position 61, where the attribute value is wrapped in the wrong kind of quotes:

The DM Serif project was commissoned by Google from <a href=“https://colophon-foundry.org“>Colophon …

That is “ (Left Double Quotation Mark, uni201C) instead of " (Quotation Mark, uni0022) or ' (Apostrophe, uni0027).
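To locate this kind of problem before any parser runs, a small scan for attribute values opened by a typographic quote can help. The following is only a sketch: the function name `find_curly_quoted_attributes` and the regular expression are illustrative assumptions, not part of Font Bakery.

```python
import re

# Typographic quote characters that are not valid XML attribute delimiters.
CURLY_QUOTES = "\u201c\u201d\u2018\u2019"
# An "=" (optionally followed by whitespace) directly followed by a curly quote.
BAD_ATTR_RE = re.compile(r"=\s*[" + CURLY_QUOTES + "]")


def find_curly_quoted_attributes(text):
    """Return (line, column, character) for each attribute value opened by a curly quote.

    Hypothetical helper for illustration; line and column are 1-based.
    """
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in BAD_ATTR_RE.finditer(line):
            column = match.end()  # 1-based column of the quote character itself
            hits.append((line_no, column, line[column - 1]))
    return hits
```

Run against the offending line of the DESCRIPTION file, this should report the position of the opening “ after `href=`.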

Expected behaviour

Maybe we need a new check, "DESCRIPTION is well formed", that detects parser errors. And/or all checks that parse this file directly should be aware of possible parser errors and FAIL accordingly.
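A minimal sketch of what "FAIL accordingly" could look like follows. It is not Font Bakery's actual check API: the function name and the `(status, message)` tuples are assumptions for illustration, and the standard library's `xml.etree.ElementTree` stands in for lxml so that the example is self-contained.

```python
import xml.etree.ElementTree as ET


def check_description_well_formed(description):
    """Yield (status, message) pairs; report FAIL instead of raising on a parse error.

    Hypothetical check for illustration only.
    """
    try:
        # Same wrapping as in the traceback: the file is a snippet without a root element.
        ET.fromstring("<html>" + description + "</html>")
    except ET.ParseError as err:
        line, column = err.position
        yield "FAIL", (
            f"DESCRIPTION does not parse as XML (line {line}, column {column}): {err}"
        )
    else:
        yield "PASS", "DESCRIPTION parses without errors."
```

With a guard like this, a malformed attribute would surface as an ordinary FAIL result with a position, rather than as an ERROR with a traceback in every check that parses the file.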

Resources and exact process needed to replicate

I’m not sure why this didn’t surface earlier.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

vv-monsalve commented, Jun 25, 2020

Thanks, @felipesanches. It makes sense.

felipesanches commented, Jun 25, 2020

Since we already had com.google.fonts/check/description/valid_html, I decided to extend it to ensure that the file parses correctly and is only a snippet (not including the main <html> tag).

I also updated its code-test routines and rationale description to reflect the additional requirements.
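The two requirements described in this comment (the file parses, and it is a snippet without an <html> tag) can be sketched roughly as below. This is not the actual implementation of com.google.fonts/check/description/valid_html: the function name, the regular expression and the messages are hypothetical, and the standard library parser is used in place of lxml.

```python
import re
import xml.etree.ElementTree as ET

# Matches an opening or closing <html> tag, in any letter case.
HTML_TAG_RE = re.compile(r"<\s*/?\s*html\b", re.IGNORECASE)


def validate_description_snippet(description):
    """Return a list of problems found in a DESCRIPTION snippet (empty if none).

    Hypothetical helper for illustration only.
    """
    problems = []
    if HTML_TAG_RE.search(description):
        problems.append("contains an <html> tag; expected a snippet only")
    try:
        # Wrap the snippet in a root element so that it can be parsed as one document.
        ET.fromstring("<html>" + description + "</html>")
    except ET.ParseError as err:
        problems.append(f"does not parse: {err}")
    return problems
```

Returning a list keeps the two conditions independent, so a file that is both a full document and malformed reports both problems.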
