`append()` and `prepend()` messing up with DOM on @1.0.0-rc.5

Disclaimer: not very experienced with Cheeriojs

Environment: Windows 10, Node.js 14.15.4

Using a very simple source HTML file (note that the file is UTF-8 with a BOM and Windows CRLF line endings):

<!DOCTYPE html>
<html>
<head>
<title></title>
<meta charset ='utf-8'/></head>
<body>
<strong><span style="font-size: 14pt;">January 1 – December 31, 2021<br /></span></strong>
</body>
</html>

I have very simple code that prepends a <style> element with inline CSS to the <head> element.

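const fs = require('fs');
const cheerio = require('cheerio');

// htmlFile and fileName are the input and output paths
let htmlString = fs.readFileSync(htmlFile, 'utf8');
let $ = cheerio.load(htmlString);
// Insert the <style> element as the first child of <head>
$('head').prepend('<style type="text/css">cssData</style>');
fs.writeFileSync(fileName, $.root().html(), 'utf8');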

When using @1.0.0-rc.5, the elements are rearranged: <title> and <meta> are moved into <body>, and <!DOCTYPE html> is removed. The BOM is also removed, and line endings change from CRLF to LF.

<html>
<head>
<style type="text/css">
{{myCSS}}
</style>
</head>
<body>
<title></title>
<meta charset="utf-8">
<strong><span style="font-size: 14pt;">January 1 – December 31, 2021<br></span></strong>
</body>
</html>

This does not happen with @0.22.0. However, the BOM is converted to a character entity (and line endings also change from CRLF to LF):

&#xFEFF;<!DOCTYPE html>

(Maybe there are options to load or serialize the string while keeping the BOM, but I can’t find any documentation or articles.)
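To confirm exactly what changed between the input and output files, it can help to inspect the raw bytes rather than the decoded string. The following is a minimal sketch using only Node.js built-ins; `inspectBuffer` is a hypothetical helper name, not part of cheerio or Node.js.

```javascript
// Hypothetical diagnostic helper; uses only Node.js built-ins.

// Reports whether a UTF-8 buffer starts with a BOM (bytes EF BB BF)
// and how many CRLF and bare LF line endings it contains.
function inspectBuffer(buf) {
  const hasBOM =
    buf.length >= 3 && buf[0] === 0xef && buf[1] === 0xbb && buf[2] === 0xbf;
  const text = buf.toString('utf8');
  const crlf = (text.match(/\r\n/g) || []).length;
  // Every CRLF also contains an LF, so subtract to count bare LFs only.
  const lf = (text.match(/\n/g) || []).length - crlf;
  return { hasBOM, crlf, lf };
}
```

Calling it on `fs.readFileSync(htmlFile)` and `fs.readFileSync(fileName)` (without an encoding argument, so that a Buffer is returned) should show whether the BOM and CRLF line endings survived the round trip.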

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 13 (4 by maintainers)

Top GitHub Comments

1 reaction
fb55 commented, Jun 17, 2021

@niemyjski No, this is caused by parse5 producing a spec-compliant DOM structure. You might want to run Prettier on the output, or set the _useHtmlparser2 option.

1 reaction
5saviahv commented, Mar 26, 2021
  • The CRLF conversion is done by parse5.
  • With the BOM intact, parse5 seems to get confused: because the BOM is not whitespace, the parser switches to the "in body" insertion mode and ignores the doctype, since a doctype cannot appear inside the HTML body.

I suggest removing the BOM before processing the data and adding it back before writing. Here are some simple helper functions for removing and adding the BOM.

// Helpers for BOM handling
// Removes a leading BOM from the string, if present
const delBOM = (str) => (str.charCodeAt(0) === 0xfeff ? str.slice(1) : str);
// Adds a BOM and converts LF line endings to CRLF for Windows
const addBOM = (str) => (String.fromCharCode(0xfeff) + str.replace(/\n/g, '\r\n'));

And their usage:

let htmlString = fs.readFileSync(htmlFile, 'utf8');
let $ = cheerio.load(delBOM(htmlString));
$('head').prepend('<style type="text/css">cssData</style>');
fs.writeFileSync(fileName, addBOM($.root().html()), 'utf8');

They are far from foolproof, but may make life a little easier.
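One limitation of the helpers above is that `addBOM` always adds a BOM and CRLF line endings, even when the input file had neither. A minimal sketch of a variant that records the input's format and restores only what was there follows. All function names here (`detectFormat`, `stripBOM`, `restoreFormat`, `withPreservedFormat`) are hypothetical, and the `transform` callback stands in for the cheerio step.

```javascript
// Hypothetical helpers; not part of cheerio or parse5.

// Records whether the text starts with a BOM and whether it uses CRLF line endings.
function detectFormat(str) {
  return {
    hasBOM: str.charCodeAt(0) === 0xfeff,
    hasCRLF: str.includes('\r\n'),
  };
}

// Removes a leading BOM, if present.
function stripBOM(str) {
  return str.charCodeAt(0) === 0xfeff ? str.slice(1) : str;
}

// Re-applies the recorded BOM and line-ending style to the text.
function restoreFormat(str, format) {
  // Normalize to LF first so existing CRLF pairs are not turned into \r\r\n.
  let out = str.replace(/\r\n/g, '\n');
  if (format.hasCRLF) out = out.replace(/\n/g, '\r\n');
  return format.hasBOM ? '\uFEFF' + out : out;
}

// Runs `transform` on BOM-free text, then restores the original format.
function withPreservedFormat(input, transform) {
  const format = detectFormat(input);
  return restoreFormat(transform(stripBOM(input)), format);
}
```

With cheerio, the callback passed to `withPreservedFormat` would load the string, call `prepend()`, and return `$.root().html()`, in the same way as the usage shown above.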
