Parsing a self-closing tag produces bad HTML

The self-closing tag is being removed.

Example:

cheerio.load(`<div class="n-content-video n-content-video--youtube">
			<iframe src="https://www.youtube.com/?rel=0"/>
		</div>`).html()

This produces:

'<html><head></head><body><div class="n-content-video n-content-video--youtube"> <iframe src="https://www.youtube.com/?rel=0"> </div></iframe></div></body></html>'

As you can see, the closing slash on the iframe has disappeared, producing bad HTML syntax.
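For context, an HTML parser ignores the trailing slash on any element that is not a void element, so `<iframe .../>` is read as a plain opening tag and everything after it ends up inside the iframe. The sketch below lists which self-closed tags in a string would be affected. It is plain JavaScript with no dependencies; `VOID_ELEMENTS` and `findIgnoredSelfClosing` are hypothetical names for illustration, not part of cheerio, and the regex assumes there is no `>` inside attribute values and no embedded SVG or MathML (where the slash is honored).

```javascript
// Void elements listed in the HTML Living Standard. Obsolete elements such
// as <param> are deliberately left out of this illustrative list.
const VOID_ELEMENTS = new Set([
  'area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input',
  'link', 'meta', 'source', 'track', 'wbr',
]);

// Hypothetical helper: return the names of tags written as `<name ... />`
// whose trailing slash an HTML parser would ignore (non-void elements).
function findIgnoredSelfClosing(html) {
  const names = [];
  // Simplified tag matcher: tag name, then any attributes, then `/>`.
  const pattern = /<([a-zA-Z][a-zA-Z0-9-]*)\b[^<>]*\/>/g;
  for (const match of html.matchAll(pattern)) {
    const name = match[1].toLowerCase();
    if (!VOID_ELEMENTS.has(name)) names.push(name);
  }
  return names;
}

// The iframe is reported; <br/> is a void element and is fine as written.
console.log(findIgnoredSelfClosing(
  '<div><iframe src="https://www.youtube.com/?rel=0"/><br/></div>'
)); // logs [ 'iframe' ]
```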

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 5 (1 by maintainers)

Top GitHub Comments

1 reaction
5saviahv commented, Aug 14, 2022

When I put the code from above into Chrome, I got:

<html><head></head><body><div class="n-content-video n-content-video--youtube">
<iframe src="https://www.youtube.com?rel=0"></iframe>
</div><div><div class="n-recommended"></div></div>
</body></html>

So the browser actually closes the iframe before its parent div element.

Maybe you should decode the content first, so you can avoid this “repairing” behavior.

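const cheerio = require('cheerio');

// selfclosedHTML is the input string that contains tags such as <iframe .../>

// decode self-closed tags as a fragment: parse as XML, serialize as HTML
const decodedHTML = cheerio.load(selfclosedHTML, { xmlMode: true }, false).html({ xmlMode: false });

// and now use it as regular HTML
console.info(cheerio.load(decodedHTML).html());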

result:

<html><head></head><body><div class="n-content-video n-content-video--youtube">
<iframe src="https://www.youtube.com?rel=0"></iframe>
</div><div><div class="n-recommended"></div></div></body></html>
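If a second parse is not wanted, the same idea can be approximated as a string pre-processing step: rewrite every self-closed non-void tag into an explicit open/close pair before handing the markup to an HTML parser. This is only a sketch under stated assumptions, not how cheerio does it. `VOID_ELEMENTS` and `expandSelfClosingTags` are hypothetical names, and the regex assumes quoted attribute values that contain no `>` and no embedded SVG or MathML.

```javascript
// Void elements listed in the HTML Living Standard; these are left untouched.
const VOID_ELEMENTS = new Set([
  'area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input',
  'link', 'meta', 'source', 'track', 'wbr',
]);

// Hypothetical helper: turn `<name attrs/>` into `<name attrs></name>`
// for every element that is not a void element.
function expandSelfClosingTags(html) {
  const pattern = /<([a-zA-Z][a-zA-Z0-9-]*)\b([^<>]*?)\s*\/>/g;
  return html.replace(pattern, (tag, name, attrs) =>
    VOID_ELEMENTS.has(name.toLowerCase()) ? tag : `<${name}${attrs}></${name}>`
  );
}

const input = '<div><iframe src="https://www.youtube.com/?rel=0"/></div>';
console.log(expandSelfClosingTags(input));
// logs <div><iframe src="https://www.youtube.com/?rel=0"></iframe></div>
```

The expanded string can then be passed to `cheerio.load` as ordinary HTML. A regex cannot handle every edge case (comments, scripts, unquoted values ending in a slash), so the XML round trip shown above remains the more robust option.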

0 reactions
juanSanchezAlcala commented, Aug 3, 2022

The problem comes from transforming from XML to HTML:

 cheerio.load(cheerio.load(`<div class="n-content-video n-content-video--youtube">
			<iframe src="https://www.youtube.com?rel=0"></iframe>
		</div><div><div class="n-recommended"></div></div>`).xml()).html()

It produces:

'<html><head></head><body><div class="n-content-video n-content-video--youtube">
			<iframe src="https://www.youtube.com?rel=0">
		</div><div><div class="n-recommended"/></div></body></html></iframe></div></body></html>'

Shouldn’t the output text be the same as the input?

Top Results From Across the Web

Are (non-void) self-closing tags valid in HTML5?
The slash at the end of the start tag is allowed, but has no meaning. It is just syntactic sugar for people (and...

Template parser produces wrong locations for HTML self ...
Template parser from @angular/compiler package produces wrong locations for HTML self-closing tags. Here the initial simple template: <input ...

How to respect non-self-closing br tags when apex parsing ...
This appears to be because this string is treated as "xhtml+xml" or maybe just "xml" content type, despite the class name being Dom.Document...

T134423 Deprecate nonstandard behavior of self-closed ...
HTML5 just says that void elements can have any end tag and must then be self-closed or implicitly closed immediaterly without parsing any...

Self Closing Tags in HTML (With Examples) - Tutorials Tonight
A self closing tags in HTML are the type of HTML tags that need not to be closed manually by its closing tag,...