An immediately closed HTML tag does not parse properly


I’m trying to parse user-supplied HTML that happens to contain an empty <div></div> element. When I parse it in XML mode, the element is serialized as <div/>.

When that output is then parsed back as HTML, the <div> becomes a wrapper around the content that follows it.

Example:

<div></div>
<p>hello world</p>

Parsing into XML:
const xml = $.load('<div></div><p>hello world</p>', {xmlMode: true});

Parsing back into HTML:
const newHtml = xml.html({decodeEntities: true});

Gives:

<div>
<p>hello world</p>
</div>

This is a problem because any styling applied to the div now also affects the content it wraps.

I’m not sure whether this is a bug or intentional. I know that an empty <div></div> doesn’t make much sense, but it’s something I have to deal with. To clarify: if the div has a space inside it, the HTML is parsed as expected.

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 8 (2 by maintainers)

Top GitHub Comments

1 reaction
nicksheen commented, Aug 14, 2018

FYI, unicron’s repro above was obtained with these options:

var content = cheerio.load('<p><strong></strong>Hello <br/>world</p>', {
    decodeEntities: false,
    xmlMode: true,
});

This outputs XHTML: content.html() === '<p><strong/>Hello <br/>world</p>'

Our difficulty with this is that we are outputting our document as HTML. Browsers cope with HTML containing <br/> but not with <strong/>, so we need to fix our own inconsistencies. On further reflection, I don’t think the issue we experienced is with cheerio.
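
The difference between <br/> and <strong/> comes from how HTML treats the trailing slash: on void elements such as br it is harmless, but on other HTML elements it is ignored, so <strong/> or <div/> is read as an opening tag that stays open. A minimal sketch of one possible workaround follows, assuming the XML-mode output is post-processed as a string before being served as HTML. The names expandSelfClosingTags, VOID_ELEMENTS and SELF_CLOSING_TAG are hypothetical and not part of cheerio, and the regular expression is only an approximation that would mis-handle attribute values containing <, > or />.

```javascript
// Hypothetical helper, not part of cheerio: expands XML-style self-closing
// tags into explicit open/close pairs so HTML parsers do not leave them open.

// Void elements per the HTML standard; a trailing slash on these is harmless.
const VOID_ELEMENTS = new Set([
  'area', 'base', 'br', 'col', 'embed', 'hr', 'img',
  'input', 'link', 'meta', 'source', 'track', 'wbr',
]);

// Matches <name/> or <name attrs/>. Regex-based, so this is an approximation:
// attribute values containing '<', '>' or '/>' are not handled.
const SELF_CLOSING_TAG = /<([A-Za-z][\w:-]*)((?:\s[^<>]*?)?)\s*\/>/g;

function expandSelfClosingTags(markup) {
  return markup.replace(SELF_CLOSING_TAG, (match, name, attrs) =>
    // Leave void elements untouched; give every other element a closing tag.
    VOID_ELEMENTS.has(name.toLowerCase()) ? match : `<${name}${attrs}></${name}>`
  );
}

console.log(expandSelfClosingTags('<div/><p>hello world</p>'));
// <div></div><p>hello world</p>
console.log(expandSelfClosingTags('<p><strong/>Hello <br/>world</p>'));
// <p><strong></strong>Hello <br/>world</p>
```

If the document is only ever served as HTML, avoiding XML-mode serialization in the first place may be simpler than post-processing the string like this.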

0 reactions
fb55 commented, Dec 22, 2020

Fixed by #985


