Headless Chrome Puppeteer generated PDF does not show some Unicode fonts on Acrobat
Steps to reproduce
- Use the HTML with Headless Chrome to generate a PDF.
- Open the PDF in Acrobat Reader and notice the dialog box reporting missing fonts. The PDF output differs between Windows systems: the PDF I attached was generated on my system and shows some fonts missing, while on another Windows system a different set of fonts was missing. It seems that Headless Chrome embeds fonts differently from what Acrobat expects.
Tell us about your environment:
- Puppeteer version: 1.10.0
- Platform / OS version: Win10
- URLs (if applicable):
- Node.js version: 10.13.0
What steps will reproduce the problem? Use the code below to render the HTML file to generate a PDF.
Please include code that reproduces the issue.
const puppeteer = require('puppeteer');

let browser; // shared so generatePdf() can close it

async function createHeadlessChromeInstance() {
  browser = await puppeteer.launch({
    ignoreHTTPSErrors: false
  });
  const page = await browser.newPage();
  // Set viewport to a fixed size.
  await page.setViewport({width: 750, height: 600});
  return page;
}

async function generatePdf() {
  const navigationTimeout = 60000; // 1 min
  // create page instance
  const page = await createHeadlessChromeInstance();
  try {
    await page.goto("file://" + 'c:\\testdata_font.htm', {
      timeout: navigationTimeout,
      waitUntil: ['load', 'networkidle2']
    });
    await page.waitFor(500);
    // generate pdf for the current page
    const pdfProperties = {
      path: 'output.pdf',
      // page.pdf() expects margin as an object, not a single string
      margin: {top: '1in', right: '1in', bottom: '1in', left: '1in'},
      displayHeaderFooter: false,
      printBackground: true,
      width: '8.27in',
      height: '11.7in'
    };
    await page.pdf(pdfProperties);
  } finally {
    // close the browser even if navigation or printing fails
    await browser.close();
  }
}

async function main() {
  try {
    await generatePdf();
  } catch (e) {
    console.error(e);
    return process.exit(1);
  }
  console.log("pdf conversion completed successfully.");
  return process.exit(0);
}

main();
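A side note on the URL passed to page.goto(): the repro concatenates "file://" with a backslash path. That is probably not related to the font problem, but to rule it out the path can be turned into a conventional file:/// URL first. The helper below is only a sketch; windowsPathToFileUrl is a hypothetical name, and it assumes an absolute drive-letter path.

```javascript
// Hypothetical helper (not part of Puppeteer): convert an absolute Windows
// path such as 'c:\\testdata_font.htm' into a file:/// URL.
function windowsPathToFileUrl(winPath) {
  // Use forward slashes, as URLs do.
  const forward = winPath.replace(/\\/g, '/');
  if (!/^[a-zA-Z]:\//.test(forward)) {
    throw new Error('Expected an absolute drive path, got: ' + winPath);
  }
  // encodeURI escapes spaces and non-ASCII characters but keeps ':' and '/'.
  // It leaves '#' and '?' alone, so escape those explicitly.
  const encoded = encodeURI(forward).replace(/#/g, '%23').replace(/\?/g, '%3F');
  return 'file:///' + encoded;
}

console.log(windowsPathToFileUrl('c:\\testdata_font.htm'));
// prints: file:///c:/testdata_font.htm
```

The result could then be passed to page.goto() in place of the concatenated string.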
The HTML is:
<!DOCTYPE html>
<html>
<head>
<title>Page Title</title>
</head>
<body>
<h1>My First Heading</h1>
<p>My first paragraph.</p>
<h1 style="font-family:Aldhabi;">This is a Aldhabi الخطوط العربية النصي</h1>
<h1 style="font-family:Arabic Typesetting;">Arabic Typesetting الخطوط العربية النصي</h1>
<h1 style="font-family:Shonar Bangla;">This is a Bangla Supplemental Fonts:</h1>
<h1 style="font-family:Shonar Bangla;">This is a বাংলায় টেক্সট</h1>
<h1 style="font-family:DengXian ;">This is a DengXian 中文文本</h1>
<h1 style="font-family:KaiTi;">This is a KaiTi 中文文本</h1>
<h1 style="font-family:DFKai-SB;">This is a DFKai-SB 中文文本</h1>
<h1 style="font-family:Aparajita;">This is a Aparajita तेक्स्त इन देवनगरी</h1>
<h1 style="font-family:Sanskrit Text;">This is a Sanskrit Text तेक्स्त इन देवनगरी</h1>
<h1 style="font-family:FrankRuehl;">This is a FrankRuehl טקסט בעברית</h1>
<h1 style="font-family:Meiryo;">This is a Meiryo テキストは日本語です</h1>
<h1 style="font-family:Tunga;">This is a Tunga ಪಠ್ಯವು ಕನ್ನಡದಲ್ಲಿದೆ</h1>
<h1 style="font-family:Batang;">This is a Batang 텍스트는 한국에있다</h1>
<h1 style="font-family:Karthika;">This is a Karthika ടെക്സ്റ്റ് മലയാളത്തിലാണ്</h1>
<h1 style="font-family:Gautami;">This is a Gautami, టెక్స్ట్ టెలోగిలో ఉంది</h1>
<h1 style="font-family:DilleniaUPC;">This is a DilleniaUPC ข้อความเป็นภาษาไทย</h1>
<h1 style="font-family:Latha;">This is a Latha தமிழ் மொழியில் உள்ளது</h1>
</body>
</html>
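Since different fonts go missing on different machines, it may help to list exactly which font families the test page asks for, so the list can be compared against the fonts installed on each system. The following is a rough sketch, not a CSS parser: listRequestedFonts is a hypothetical helper, and it assumes unquoted font-family names in inline style attributes, as in the page above.

```javascript
// Hypothetical helper: collect the distinct font-family names requested
// through inline style attributes in an HTML string.
function listRequestedFonts(html) {
  const families = new Set();
  // Capture everything after "font-family:" up to the next ';' or quote.
  const pattern = /font-family\s*:\s*([^;"']+)/gi;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    // A declaration may list fallbacks separated by commas.
    for (const name of match[1].split(',')) {
      const trimmed = name.trim();
      if (trimmed) families.add(trimmed);
    }
  }
  return Array.from(families);
}

const sample =
  '<h1 style="font-family:DengXian ;">a</h1>' +
  '<h1 style="font-family:Shonar Bangla;">b</h1>' +
  '<h1 style="font-family:Shonar Bangla;">c</h1>';
console.log(listRequestedFonts(sample));
// prints: [ 'DengXian', 'Shonar Bangla' ]
```

Running it over the full test page (read with fs.readFileSync) would give the set of fonts to check on each Windows machine.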
What is the expected result? A PDF file which is shown correctly on Acrobat Reader.
What happens instead? The PDF is shown incorrectly in Acrobat Reader. If I open the PDF with the Chrome browser or some other PDF reader, the fonts are shown. (Attached: output.pdf and a screenshot of the Acrobat error.)
Issue Analytics
- Created 5 years ago
- Comments: 7 (2 by maintainers)
Top GitHub Comments
We are seeing similar issues with the <strong> HTML element and Puppeteer 1.19. A PDF with German text inside the <strong> element works fine, whereas the French version of that text is not shown at all. The French text does not contain any special characters, while the German version has an umlaut.
I also tried using the same German text in the French PDF, but this did not help: the complete string is still missing from the PDF.
Very strange.
This issue should be re-opened because it is Puppeteer-related. In standalone Chrome the PDFs are fine, both in the print preview and after saving (on macOS).
Opened a bug here: https://bugs.chromium.org/p/chromium/issues/detail?id=915012
Thanks @aslushnikov for prompt responses!