Uneven .md vs. .mdx build times

Description

We run a documentation website that has hit a performance bottleneck at around 1000 pages. This led us to test the difference between gatsby-transformer-remark and gatsby-plugin-mdx to compare .md and .mdx build times.

We realize that these are not the same plugin, but we expected their build times to be more closely in line with one another for the exact same files.

We used the following repo to benchmark results: https://github.com/johnatspreadstreet/gatsby-md-vs-mdx

Here were the results of our test using the auto generated files:

Source and Transform Nodes

| # of Pages | md | mdx |
| --- | --- | --- |
| 100 | 0.17s | 3.12s |
| 1000 | 0.90s | 23.05s |
| 8000 | 5.53s | 192.80s |

Steps to reproduce

GitHub repo: https://github.com/johnatspreadstreet/gatsby-md-vs-mdx

For markdown files:

  1. Make sure line 54 of md.generate.js is set to create .md
  2. Make sure gatsby-transformer-remark is used (and not gatsby-plugin-mdx) in the gatsby-config.js file
  3. Run npm run bench or yarn run bench

For mdx files:

  1. Make sure line 54 of md.generate.js is set to create .mdx
  2. Make sure gatsby-plugin-mdx is used (and not gatsby-transformer-remark) in the gatsby-config.js file
  3. Run npm run bench or yarn run bench

Expected result

Results of the gatsby build process for .md and .mdx files should be within reasonable distance of one another.

Actual result

Build times for .mdx files were roughly 18 to 35 times longer than for .md files in the source and transform nodes step.
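
For reference, the slowdown factor can be recomputed from the timings in the Description table. The snippet below is a small sketch; the variable names are made up for the example and the numbers are copied from the table.

```javascript
// Timings in seconds for the "source and transform nodes" step,
// copied from the table in the Description.
const timings = [
  { pages: 100, md: 0.17, mdx: 3.12 },
  { pages: 1000, md: 0.9, mdx: 23.05 },
  { pages: 8000, md: 5.53, mdx: 192.8 },
];

// Ratio of mdx time to md time, rounded to one decimal place.
const slowdowns = timings.map(({ pages, md, mdx }) => ({
  pages,
  ratio: Number((mdx / md).toFixed(1)),
}));

for (const { pages, ratio } of slowdowns) {
  console.log(`${pages} pages: mdx took ${ratio}x as long as md`);
}
// 100 pages: mdx took 18.4x as long as md
// 1000 pages: mdx took 25.6x as long as md
// 8000 pages: mdx took 34.9x as long as md
```

The ratio grows with the page count, which suggests the mdx step scales worse than linearly relative to the md step in this benchmark.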

Environment

System:
  OS: Windows 10
  CPU: (16) x64 Intel® Core™ i9-9900K CPU @ 3.60GHz
Binaries:
  Yarn: 1.18.0 - C:\Program Files (x86)\Yarn\bin\yarn.CMD
  npm: 6.9.0 - C:\Program Files\nodejs\npm.CMD
Languages:
  Python: 2.7.15 - /c/Users/JYoun/.windows-build-tools/python27/python
Browsers:
  Edge: 44.18362.449.0
npmPackages:
  gatsby: ^2.19.5 => 2.19.45
  gatsby-image: ^2.2.39 => 2.2.44
  gatsby-plugin-benchmark-reporting: * => 0.0.13
  gatsby-plugin-feed: ^2.3.26 => 2.3.29
  gatsby-plugin-google-analytics: ^2.1.34 => 2.1.38
  gatsby-plugin-manifest: ^2.2.38 => 2.2.48
  gatsby-plugin-mdx: ^1.0.83 => 1.0.83
  gatsby-plugin-offline: ^3.0.32 => 3.0.41
  gatsby-plugin-react-helmet: ^3.1.21 => 3.1.24
  gatsby-plugin-sharp: ^2.4.0 => 2.4.13
  gatsby-plugin-typography: ^2.3.21 => 2.3.25
  gatsby-remark-copy-linked-files: ^2.1.36 => 2.1.40
  gatsby-remark-images: ^3.1.42 => 3.1.50
  gatsby-remark-prismjs: ^3.3.30 => 3.3.36
  gatsby-remark-responsive-iframe: ^2.2.31 => 2.2.34
  gatsby-remark-smartypants: ^2.1.20 => 2.1.23
  gatsby-source-filesystem: ^2.1.46 => 2.1.56
  gatsby-transformer-remark: ^2.6.48 => 2.6.59
  gatsby-transformer-sharp: ^2.3.13 => 2.3.19

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 4
  • Comments:11 (5 by maintainers)

Top GitHub Comments

2 reactions
johno commented, Mar 25, 2020

Here’s where I’m at so far: yesterday I ran benchmarks and could confirm I’m seeing the same growth in time during sourcing/transforming nodes. Each time I double the number of MDX pages I see ~2x increase in time in “source and transform nodes”. It appears to be continually growing, too. 🏒

| Files | Total build | Source and transform nodes |
| --- | --- | --- |
| 500 | 30s | 13s |
| 1k | 51s | 23s (+1.7x) |
| 2k | 90s | 47s (+2x) |
| 4k | 180s | 100s (+2x) |
| 8k | 440s | 230s (+2.3x) |
| 16k | 1100s | 530s (+2.3x) |

After some digging I found a usage of a manual node filter for type using getNodes rather than getNodesByType. This, unsurprisingly, caused a lot of unnecessary node traversal at scale. That fix (in #22555) saw a ~30% reduction in build times for 16k pages (using my benchmark site so YMMV).
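
To illustrate the general pattern (this is not the actual change in #22555), the sketch below contrasts a manual type filter over `getNodes()` with a `getNodesByType()` lookup. `getNodes` and `getNodesByType` are the helper names Gatsby passes to its Node APIs, but `createToyStore` is a hypothetical stand-in written for this example, and it assumes the typed lookup is backed by a per-type index.

```javascript
// Toy stand-in for Gatsby's node store (hypothetical, for illustration only).
// It keeps a flat list of all nodes plus an index keyed by node type.
function createToyStore(nodes) {
  const byType = new Map();
  for (const node of nodes) {
    const type = node.internal.type;
    if (!byType.has(type)) byType.set(type, []);
    byType.get(type).push(node);
  }
  return {
    getNodes: () => nodes,
    getNodesByType: (type) => byType.get(type) || [],
  };
}

// 16,000 File nodes, each paired with one Mdx node.
const nodes = [];
for (let i = 0; i < 16000; i++) {
  nodes.push({ id: `file-${i}`, internal: { type: "File" } });
  nodes.push({ id: `mdx-${i}`, internal: { type: "Mdx" } });
}
const { getNodes, getNodesByType } = createToyStore(nodes);

// Manual filter: walks every node of every type on each call.
const viaFilter = getNodes().filter((n) => n.internal.type === "Mdx");

// Typed lookup: returns only the nodes of the requested type.
const viaType = getNodesByType("Mdx");

console.log(viaFilter.length, viaType.length); // 16000 16000
```

Both calls return the same nodes; the difference is that the manual filter touches all 32,000 nodes every time it runs, which adds up when it is called once per page.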

My benchmark results after the change

| Files | Total build | Source and transform nodes |
| --- | --- | --- |
| 8k | 350s | 150s |
| 16k | 750s | 400s |

I’ll keep digging into this as I get time and will report back any additional findings and performance improvements as we get them PRed in. I suspect there’s a lot more low hanging fruit.


It’s also important to note that MDX will always be quite a bit slower than MD since it’s doing a lot more under the covers, but it definitely shouldn’t be 📈🙀

2 reactions
johnayoung commented, Mar 25, 2020

@johno Thanks for the updates here. Here is the second round of testing after pulling down the new packages (benchmark repo: https://github.com/johnatspreadstreet/gatsby-md-vs-mdx):

Source and Transform Nodes

| # of Pages | mdx pre-patch (s) | mdx post-patch (s) | diff (s) |
| --- | --- | --- | --- |
| 100 | 3.12 | 3.7 | 0.58 |
| 1000 | 23.05 | 24.3 | 1.25 |
| 8000 | 192.8 | 183.4 | -9.4 |

Looks like ~5% better performance in the upper bucket, with pretty much even performance in the lower buckets.

If you need me to test anything else, or to check any hunches, I'm happy to do so.
