Optimizing Puppeteer PDF generation - more resources? #4965

Steps to reproduce

Tell us about your environment:

  • Puppeteer version: ^1.9.0
  • Platform / OS version: CentOS
  • URLs (if applicable):
  • Node.js version: 10.16.11

What steps will reproduce the problem?

Please include code that reproduces the issue.

        // Fragment from inside the per-page loop. The loop header and the helpers
        // on(), timing, logEvent, and checkPdfSemaphore() are not shown here.
        timing.start('generate_page_pdf');

        // Print the current report page to a numbered PDF file.
        await on(page.pdf({
          path: `${tmpdir}${fileSystemId}.${pageNum}.pdf`,
          landscape: true,
          printBackground: true
        }));

        logEvent({ action: 'file_write', pid: engagement_id, session: sessionId, result: `${tmpdir}${fileSystemId}.${pageNum}.pdf`, args: req.transactionId });

        // Check for the existence of a link to click to navigate to the next page.
        const dontClickLink = linkHref && linkHref.indexOf('endofreport') > -1;

        reportEnd = !nextPageLink || dontClickLink;

        if (nextPageLink && !dontClickLink) {
          pageNum += 1;
          // Advance the report to the next page from inside the browser context.
          await on(page.evaluate(() => {
            document.querySelector('#report-progress-link').click();
          }));
          // Wait 70 ms so the page can finish rendering before the next snapshot.
          await on(checkPdfSemaphore(70));
        }

        logEvent({ action: 'generate_page_pdf', pid: engagement_id, session: sessionId, result: timing.end('generate_page_pdf'), args: req.transactionId });

      }

What is the expected result?

The above code works fine. The checkPdfSemaphore() function waits 70 ms so that the rather complex Angular code on the page being PDF’d has time to complete before a PDF snapshot is taken and the loop moves on to the next page.
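One alternative to a fixed 70 ms pause is to wait for an explicit readiness signal from the page. The following is a minimal sketch, not the author's code: it assumes the Angular app can set a flag once a report page has finished rendering. The flag `window.__reportPageReady` and the helper name `goToNextReportPage` are hypothetical; `page.evaluate` and `page.waitForFunction` are existing Puppeteer calls.

```javascript
// Hypothetical helper: click through to the next report page and wait until
// the page signals that rendering is done, instead of sleeping a fixed 70 ms.
// Assumption: the Angular app sets window.__reportPageReady = true after each
// report page has rendered. That flag does not exist in the original code.
async function goToNextReportPage(page, timeoutMs = 2000) {
  // Reset the flag and click in one round trip so a stale "ready" is never seen.
  await page.evaluate(() => {
    window.__reportPageReady = false;
    document.querySelector('#report-progress-link').click();
  });

  // Poll on every animation frame until the flag is set or the timeout expires.
  await page.waitForFunction(() => window.__reportPageReady === true, {
    polling: 'raf',
    timeout: timeoutMs,
  });
}
```

If the page usually finishes in well under 70 ms, this kind of wait returns as soon as the flag flips; if a page occasionally needs longer, it waits instead of printing a half-rendered page.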

As of now, generate_page_pdf takes around 250-300 ms per page. Subtracting the 70 ms for the semaphore leaves 180-230 ms per page for rendering. This is all fine until I reach the 30 s timeout on our Heroku instance, which happens in the 90-100 page range.
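The numbers above can be turned into a rough page budget. This is a back-of-the-envelope sketch using only figures from the issue; `maxPages` is a hypothetical helper, and whether the roughly 7 s initial report load counts against the 30 s limit is an assumption, so it is left as a parameter.

```javascript
// Rough estimate of how many pages fit inside a request timeout.
// timeoutMs:     total time allowed for the request
// perPageMs:     measured time per generated page
// initialLoadMs: time spent before the first page (assumed to count
//                against the timeout when non-zero)
function maxPages(timeoutMs, perPageMs, initialLoadMs = 0) {
  return Math.floor((timeoutMs - initialLoadMs) / perPageMs);
}

console.log(maxPages(30000, 300));       // 100
console.log(maxPages(30000, 250));       // 120
console.log(maxPages(30000, 300, 7000)); // 76
console.log(maxPages(30000, 250, 7000)); // 92

// If the 70 ms wait could be dropped entirely (180 ms per page, 7 s load):
console.log(maxPages(30000, 180, 7000)); // 127
```

The reported 90-100 page ceiling falls between these estimates, which suggests the per-page time and the initial load together account for most of the 30 s budget.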

I’m tasked with bringing this time down. Are there flags on Chromium that I can use to let it use more resources? Are there other optimizations I can try?

Note: I’ve tried using setContent() instead of page.goto() - it makes no real difference in the initial load of the report, which is around 7s.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 4
  • Comments: 5

Top GitHub Comments

bh-hsi commented, Jun 30, 2021 (1 reaction)

@milczarekIT With the latest Puppeteer, you can key off font events and other page load events to make sure that all of the resources are finalized. The only other solution that we implemented was bringing all fonts and outside frameworks (Bootstrap, for example) local to the file system, so that loading is faster, and rendering the PDFs using Handlebars and setContent() to speed up page rendering.

So far, there are no real flags in Chromium or Puppeteer that help.
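The approach described in this comment might look roughly like the sketch below. It is an illustration under assumptions, not the commenter's implementation: `renderPdfFromHtml` is a hypothetical name, and `html` is assumed to be already rendered by a template engine (Handlebars in the comment) with fonts and CSS referenced from the local file system. `document.fonts.ready` is the standard browser promise that settles once font loading has finished.

```javascript
// Hypothetical helper: load pre-rendered HTML into the page, wait for fonts,
// then print to PDF with the same options as the original snippet.
async function renderPdfFromHtml(page, html, pdfPath) {
  // Inject the markup directly instead of navigating to a URL.
  await page.setContent(html);

  // Wait inside the browser until all fonts have finished loading, so the
  // PDF is not printed with fallback fonts.
  await page.evaluate(async () => {
    await document.fonts.ready;
  });

  return page.pdf({ path: pdfPath, landscape: true, printBackground: true });
}
```

Waiting on the font promise replaces a guess about how long loading takes with an event the browser already exposes.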

stale[bot] commented, Jul 24, 2022 (0 reactions)

We are closing this issue. If the issue still persists in the latest version of Puppeteer, please reopen the issue and update the description. We will try our best to accommodate it!
