PDF Generation hanging for documents with many large images
Steps to reproduce
Tell us about your environment:
- Puppeteer version: 0.13.0
- Platform / OS version: OSX 10.13.1
- Node versions: tested with v8.6.0, v8.9.0 & v9.2.0
- URLs (if applicable): N/A
What steps will reproduce the problem?
The code below isolates a problem I discovered while using Puppeteer to generate PDF reports. Puppeteer has been fantastic so far, but pages containing a large number of images cause PDF generation to hang.
index.html:
<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
  <title>Test App</title>
</head>
<body>
  <div id="root"></div>
</body>
</html>
pdfImages.js:
const puppeteer = require('puppeteer');

const IMAGE_URIS = [
  // A list of 1949 image URIs as strings.
  // The images range in size from 1KB to 1.7MB, with an average size of 200KB.
];

const PDF_OPTIONS = {
  path: `${__dirname}/output.pdf`,
  format: 'A4',
  printBackground: true,
  landscape: false,
  margin: {top: '10mm', left: '10mm', right: '10mm', bottom: '15mm'},
};

const IMAGE_LOADING_TIMEOUT = 60 * 1000 * 5; // 5 minutes

// Runs in the page context: appends one <img> per URI under #root.
function addImagesToPage(imageUriList) {
  const root = document.getElementById('root');
  imageUriList.forEach((imageUri) => {
    const div = document.createElement('div');
    const img = new Image();
    img.src = imageUri;
    img.style = 'max-width: 20vw; max-height: 20vh;';
    div.appendChild(img);
    root.appendChild(div);
  });
}

// Runs in the page context: true once every <img> has finished loading.
function waitForAllImagesToCompleteLoading() {
  return Array.from(document.getElementsByTagName('img'))
    .every((img) => img.complete);
}

let browser;
puppeteer.launch({headless: true})
  .then((newBrowser) => {
    browser = newBrowser;
    return browser.newPage();
  })
  .then((page) => {
    return page.goto(`file://${__dirname}/index.html`)
      .then(() => page.evaluate(addImagesToPage, IMAGE_URIS))
      .then(() => page.waitForFunction(waitForAllImagesToCompleteLoading, {timeout: IMAGE_LOADING_TIMEOUT}))
      .then(() => page.pdf(PDF_OPTIONS))
      .then(() => page.close());
  })
  .then(() => browser.close());
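One caveat worth noting (my observation, not part of the original report): `img.complete` is also `true` for images whose fetch failed, so a `complete`-based predicate finishes even when some images are broken; a stricter check additionally requires `naturalWidth > 0`. A minimal sketch of both predicates, written here against plain objects standing in for `HTMLImageElement` so it can run outside a browser:

```javascript
// Loose predicate, mirroring the script above: every image has settled,
// whether it loaded successfully or errored out.
function allImagesSettled(images) {
  return images.every((img) => img.complete);
}

// Stricter variant (assumption, not from the issue): a successfully decoded
// image always has a non-zero naturalWidth, so this rejects broken images.
function allImagesLoadedSuccessfully(images) {
  return images.every((img) => img.complete && img.naturalWidth > 0);
}

// Stand-ins for HTMLImageElement: one loaded image, one failed image.
const loaded = {complete: true, naturalWidth: 640};
const failed = {complete: true, naturalWidth: 0};

console.log(allImagesSettled([loaded, failed]));            // → true
console.log(allImagesLoadedSuccessfully([loaded, failed])); // → false
```

In the page context either function could be passed to `page.waitForFunction` unchanged, using `Array.from(document.images)` as input.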
I run this using:
env DEBUG="puppeteer:*" node --max-old-space-size=16384 pdfImages.js
What is the expected result? Running this code should generate a PDF.
Please note that with 1100 images and --max-old-space-size=8174, the code runs without a problem.
What happens instead? Running the code with the command above causes it to hang at the PDF generation stage (the page.pdf call). Please see HungPdfPrint.log for the logs produced when this happens.
The code consistently hangs with the following combinations:
- --max-old-space-size=8174 (8GB) or 16384 (16GB), and
- 1200, 1500, or 1949 images.
When the --max-old-space-size flag is removed, the code instead crashes with a JavaScript heap out of memory error. See this log: OutOfMemoryCrash.log
Again, with 1100 images and --max-old-space-size=8174, the code runs without a problem.
Issue Analytics
- Created: 6 years ago
- Reactions: 17
- Comments: 27
Top GitHub Comments
I’m having the same issue taking PDF snapshots of HTML containing images: when the combined image size exceeds 200MB it hangs, while 193MB of images works.
@Mgonand the only way I got this to work reliably was to generate a batch of PDF files and then join them into a final file (I used pdftk). I use Puppeteer to generate a series of 10-image PDFs and then join them into one final file.
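The batching workaround above can be sketched as: split the URI list into fixed-size batches, render each batch to its own PDF with the reproduction script, then merge the parts. A minimal batching helper (the driver comments are illustrative assumptions, not code from the thread):

```javascript
// Split a list into fixed-size batches; the last batch may be smaller.
function chunkArray(items, batchSize) {
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// For the 1949 images in the issue, batches of 10 give 195 partial PDFs.
console.log(chunkArray(new Array(1949).fill('uri'), 10).length); // → 195

// Hypothetical driver (names are assumptions): for each batch, run the
// goto/evaluate/waitForFunction/pdf sequence from the script above with
// path `batch-<n>.pdf`, then join the parts with pdftk:
//   pdftk batch-*.pdf cat output final.pdf
```

Rendering 10 images at a time keeps each page's memory footprint small, which is why this sidesteps the hang, at the cost of an external merge step.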