Issues with large page numbers (>60k)

Description

I’ve been setting up a content surfacing system using GatsbyJS, and we’re encountering quite a few issues caused by the number of pages we have. I’ve made a few changes as suggested on Discord, and spent some time investigating the cause of the slowdowns.

The following symptoms have been noticed:

  1. npm run develop is significantly slower than npm run build (50 minutes vs 2 minutes)
  2. The slowdown occurs during the “running graphql queries” step, but before that text is displayed
  3. Some machines encounter an “invalid instruction” crash after the “info bootstrap finished” text (http://paste.enginehub.org/tb7s6C)
  4. It appears to be running a query per page, despite the fact that the pages have no queries. There appears to already be an issue for this (https://github.com/gatsbyjs/gatsby/issues/12216)

A few notes about our setup:

  • All data is passed to pages/templates via the context, rather than a per-page query (as recommended on Discord)
  • I can provide the source code to the Gatsby team privately if required

What we’ve discovered:

  • The major slowdowns appear to be related to the queue library, and a QuickSort that gets run on insertion of elements. From what we can tell, the “Running graphql queries” step is actually running a list of tasks that may not be related to graphql, so it may be worth renaming this.
  • As for why it significantly slows down in develop, we discovered that in build the priority function is deleted from the queue. We’ve noticed the same speedups by setting the priority of non-active paths to undefined, which skips sorting for them entirely. (https://github.com/diamondio/better-queue-memory/blob/cff881f2074ff0508bcb6e932bda0b92977d3d2b/index.js#L48)
  • Halving the number of pages cuts the time from 50 minutes to 10 minutes, so the slowdown is worse than linear.
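
To illustrate why ordering work on every insertion can scale worse than linearly, and why tasks without a priority avoid that cost, here is a toy model. It is not the code of better-queue-memory or of Gatsby; the function name, the task shape and the priority values are all made up for the illustration, and a simple sorted insertion stands in for the sort mentioned above.

```javascript
// Toy model only: NOT the actual better-queue-memory implementation.
// Counts the comparisons needed to keep a queue ordered by priority.
function enqueueAll(taskCount, usePriority) {
  const queue = [];
  let comparisons = 0;

  for (let i = 0; i < taskCount; i++) {
    // Hypothetical task shape; priorities cycle through 0..6.
    const task = { id: i, priority: usePriority ? i % 7 : undefined };

    if (task.priority === undefined) {
      queue.push(task); // no ordering work at all
      continue;
    }

    // Scan from the back to find the slot that keeps higher priorities first.
    let pos = queue.length;
    while (pos > 0) {
      comparisons++;
      if (queue[pos - 1].priority >= task.priority) break;
      pos--;
    }
    queue.splice(pos, 0, task);
  }

  return { length: queue.length, comparisons };
}

const half = enqueueAll(1000, true);
const full = enqueueAll(2000, true);
const unsorted = enqueueAll(2000, false);

console.log("comparisons, 1000 prioritised tasks:", half.comparisons);
console.log("comparisons, 2000 prioritised tasks:", full.comparisons);
console.log("comparisons, 2000 tasks without priority:", unsorted.comparisons);
```

In this model, doubling the number of prioritised tasks roughly quadruples the comparison count, while tasks with an undefined priority need no comparisons at all. The real queue will behave differently in detail, but the shape matches the observation that halving the page count reduced the time by much more than half.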

Steps to reproduce

Using the source code that I can provide privately:

  1. Run npm run build and note how long it takes to run before it crashes Node.
  2. Run npm run develop and note that it takes significantly longer, with the same crash.

Expected result

Gatsby should be able to handle this quantity of pages, as there are multiple sources that state they’re running ~10 million with little to no issue.

Actual result

Gatsby struggles at these page numbers.

Environment

  System:
    OS: macOS 10.14.2
    CPU: (12) x64 Intel® Core™ i9-8950HK CPU @ 2.90GHz
    Shell: 3.2.57 - /bin/bash
  Binaries:
    Node: 10.15.3 - ~/.nvm/versions/node/v10.15.3/bin/node
    npm: 6.8.0 - ~/.nvm/versions/node/v10.15.3/bin/npm
  Languages:
    Python: 2.7.15 - /usr/local/bin/python
  Browsers:
    Chrome: 72.0.3626.119
    Safari: 12.0.2
  npmPackages:
    gatsby: ^2.1.23 => 2.1.23
    gatsby-image: ^2.0.31 => 2.0.31
    gatsby-plugin-catch-links: ^2.0.12 => 2.0.12
    gatsby-plugin-manifest: ^2.0.22 => 2.0.22
    gatsby-plugin-offline: ^2.0.24 => 2.0.24
    gatsby-plugin-react-helmet: ^3.0.8 => 3.0.8
    gatsby-plugin-sharp: ^2.0.25 => 2.0.25
    gatsby-plugin-styled-components: ^3.0.6 => 3.0.6
    gatsby-plugin-typescript: ^2.0.10 => 2.0.10
    gatsby-plugin-web-font-loader: ^1.0.4 => 1.0.4
    gatsby-source-filesystem: ^2.0.23 => 2.0.23
    gatsby-transformer-json: ^2.1.8 => 2.1.8
    gatsby-transformer-sharp: ^2.1.15 => 2.1.15

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Reactions: 5
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

1 reaction
DSchau commented, Apr 15, 2019

@me4502 I believe we’ve fixed this with #10732 and gatsby@^2.3.20.

Closing this out, but please re-open or reply if this is not the case and you can still reproduce these OOM issues.

We’re always working on making Gatsby more scalable, and the more issues we can surface and fix, the better. Thanks for surfacing this one!

1 reaction
me4502 commented, Mar 7, 2019

@KyleAMathews That option doesn’t appear to speed it up much. However, I’ve made a PR that brings develop performance to basically the same level as build performance (https://github.com/gatsbyjs/gatsby/pull/12365)

@stefanprobst It appears that does fix the crash; even just replacing json-stringify-safe with JSON.stringify fixes it.
