PDF Generation hanging for documents with many large images

Steps to reproduce

Tell us about your environment:

Puppeteer version: 0.13.0
Platform / OS version: OSX 10.13.1
Node versions: tested with v8.6.0, v8.9.0 & v9.2.0
URLs (if applicable): N/A

What steps will reproduce the problem?

The code below isolates a problem discovered while using Puppeteer to generate PDF reports. While Puppeteer has been fantastic so far, I have come across a problem with pages containing a large number of images.

index.html:

<!doctype html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
    <title>Test App</title>
  </head>
  <body>
    <div id="root"></div>
  </body>
</html>

pdfImages.js:

const puppeteer = require('puppeteer');

const IMAGE_URIS = [
  // A list of 1949 image URIs as strings
  // The images used range in size from 1KB to 1.7MB, with an average size of 200KB.
];

const PDF_OPTIONS = {
  path: `${__dirname}/output.pdf`,
  format: 'A4',
  printBackground: true,
  landscape: false,
  margin: {top: '10mm', left: '10mm', right: '10mm', bottom: '15mm'},
};
// 5 minutes, in milliseconds
const IMAGE_LOADING_TIMEOUT = 60 * 1000 * 5;

function addImagesToPage(imageUriList) {
  const root = document.getElementById('root');
  imageUriList.forEach((imageUri) => {
    const div = document.createElement('div');
    const img = new Image();
    img.src = imageUri;
    img.style = 'max-width: 20vw; max-height: 20vh;';
    div.appendChild(img);
    root.appendChild(div);
  });
}

function waitForAllImagesToCompleteLoading() {
  const allImagesInDocument = Array.from(document.getElementsByTagName('img'));
  return allImagesInDocument
      .map((img) => img.complete)
      .every((completeStatus) => completeStatus);
}

let browser;
puppeteer.launch({headless: true})
  .then((newBrowser) => {
    browser = newBrowser;
    return browser.newPage();
  })
  .then((page) => {
    return page.goto(`file://${__dirname}/index.html`)
    .then(() => page.evaluate(addImagesToPage, IMAGE_URIS))
    .then(() => page.waitForFunction(waitForAllImagesToCompleteLoading, {timeout: IMAGE_LOADING_TIMEOUT}))
    .then(() => page.pdf(PDF_OPTIONS))
    .then(() => page.close());
  })
  .then(() => browser.close());

I run this using: env DEBUG="puppeteer:*" node --max-old-space-size=16384 pdfImages.js

What is the expected result? Running this code should generate a PDF.

Please note that with 1100 images and --max-old-space-size=8174, the code runs without a problem.

What happens instead? Running the code with the command above causes the code to hang on the PDF generation stage. Please see HungPdfPrint.log for the logs produced when this happens.

The code consistently hangs when both of the following hold:

--max-old-space-size=8174 (8GB) or 16384 (16GB), and
the number of images is 1200, 1500 or 1949.

When the --max-old-space-size flag is removed, the code crashes with an out-of-heap-memory error. See this log: OutOfMemoryCrash.log

Again, when the number of images is 1100 and --max-old-space-size=8174, the code runs without a problem.
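
The hang at the PDF generation stage produces no error to act on. Below is a minimal sketch of one way a script could turn a promise that never settles into a rejection. The helper name withTimeout is hypothetical and is not part of Puppeteer; it only surfaces the hang and does not cancel the work already running in the browser.

```javascript
// Hypothetical helper (not a Puppeteer API): reject if `promise` has not
// settled within `ms` milliseconds. `label` is only used in the error message.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise((resolve, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms
    );
  });
  // Whichever settles first wins; the timer is cleared in both outcomes
  // so it does not keep the Node process alive.
  return Promise.race([promise, timeout]).then(
    (value) => {
      clearTimeout(timer);
      return value;
    },
    (err) => {
      clearTimeout(timer);
      throw err;
    }
  );
}
```

In the script above, this could wrap the PDF step, for example withTimeout(page.pdf(PDF_OPTIONS), 10 * 60 * 1000, 'page.pdf'), where the 10-minute limit is an arbitrary value chosen for illustration.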

Issue Analytics

Created 6 years ago
Reactions: 17
Comments: 27

Top GitHub Comments

6 reactions

HugoDF commented, Dec 8, 2017

I’m having the same issue taking PDF snapshots of HTML containing images when the combined image size is more than 200MB. With 193MB of images it works, but over 200MB it hangs.
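
To check a document against the roughly 200MB figure reported in this comment before rendering, the combined size can be summed up front. This is a minimal sketch that assumes the images are local files addressed by path (the issue itself only says "image URIs"). The function names are hypothetical, and the limit is this commenter's observation, not a documented Puppeteer or Chromium threshold.

```javascript
const fs = require('fs');

// Observed in the comment above (193MB worked, over 200MB hung).
// An assumption for illustration, not a documented limit.
const SIZE_LIMIT_BYTES = 200 * 1024 * 1024;

// Sum the on-disk size of every file in `filePaths`, in bytes.
function totalSizeInBytes(filePaths) {
  return filePaths.reduce(
    (sum, filePath) => sum + fs.statSync(filePath).size,
    0
  );
}

// True when the combined size is strictly greater than `limit`.
function exceedsLimit(filePaths, limit = SIZE_LIMIT_BYTES) {
  return totalSizeInBytes(filePaths) > limit;
}
```

A script could call exceedsLimit(paths) and fall back to rendering in smaller batches when it returns true.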

5 reactions

galvez commented, Oct 23, 2018

@Mgonand the only way you’re getting this to work reliably is by creating a batch of PDF files and then joining them into a final file (I used pdftk). I use Puppeteer to generate a series of 10-image PDFs and then join them into one final file.
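
The batching workaround described in this comment might look roughly like the sketch below. It is not the commenter's actual code. The helper names (chunk, addImagesAndWait, renderInBatches) and the part-N.pdf file names are hypothetical, index.html is assumed to be the test page from the issue, and the pdftk binary is assumed to be installed and on the PATH.

```javascript
const { execFileSync } = require('child_process');
const path = require('path');

// Split a list into consecutive batches of at most `size` items.
function chunk(list, size) {
  const batches = [];
  for (let i = 0; i < list.length; i += size) {
    batches.push(list.slice(i, i + size));
  }
  return batches;
}

// Runs inside the page: append one <img> per URI and resolve once every
// image has either loaded or failed. Resolves to a plain number so the
// result can be passed back to Node.
function addImagesAndWait(imageUris) {
  const root = document.getElementById('root');
  return Promise.all(imageUris.map((uri) => new Promise((resolve) => {
    const img = new Image();
    img.onload = resolve;
    img.onerror = resolve;
    img.src = uri;
    root.appendChild(img);
  }))).then(() => imageUris.length);
}

// Hypothetical helper: render one PDF per batch, then join the parts.
async function renderInBatches(imageUris, batchSize, outputPath) {
  // Loaded lazily so that chunk() can be used without Puppeteer installed.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch({headless: true});
  const partPaths = [];
  try {
    const batches = chunk(imageUris, batchSize);
    for (let i = 0; i < batches.length; i++) {
      const page = await browser.newPage();
      await page.goto(`file://${__dirname}/index.html`);
      await page.evaluate(addImagesAndWait, batches[i]);
      const partPath = path.join(__dirname, `part-${i}.pdf`);
      await page.pdf({path: partPath, format: 'A4', printBackground: true});
      await page.close();
      partPaths.push(partPath);
    }
  } finally {
    await browser.close();
  }
  // pdftk syntax: pdftk in1.pdf in2.pdf ... cat output out.pdf
  execFileSync('pdftk', partPaths.concat(['cat', 'output', outputPath]));
}
```

With the batch size of 10 mentioned in the comment, a call would be renderInBatches(IMAGE_URIS, 10, 'output.pdf'). Each batch gets its own page, so no single page.pdf call has to hold all of the images at once.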
