Uneven .md vs. .mdx build times

Description

We run a documentation website that has hit a performance bottleneck at around 1000 pages. This led us to test the difference between gatsby-transformer-remark and gatsby-plugin-mdx to compare .md and .mdx build times.

We realize that these are not the same plugin, but we expected their build times to be more in line with one another for the exact same files.

We used the following repo to benchmark results: https://github.com/johnatspreadstreet/gatsby-md-vs-mdx

Here are the results of our test using the auto-generated files:


Source and Transform Nodes
# of Pages	md	mdx
100	0.17s	3.12s
1000	0.90s	23.05s
8000	5.53s	192.80s

Steps to reproduce

GitHub repo: https://github.com/johnatspreadstreet/gatsby-md-vs-mdx

For markdown files:

Make sure line 54 of md.generate.js is set to create .md
Make sure gatsby-transformer-remark is used (and not gatsby-plugin-mdx) in the gatsby-config.js file
Run npm run bench or yarn run bench

For mdx files:

Make sure line 54 of md.generate.js is set to create .mdx
Make sure gatsby-plugin-mdx is used (and not gatsby-transformer-remark) in the gatsby-config.js file
Run npm run bench or yarn run bench

Expected result

Results of the gatsby build process for .md and .mdx files should be within reasonable distance of one another.

Actual result

For the source and transform nodes step, .mdx builds took roughly 18 to 35 times as long as .md builds.
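
The slowdown factors can be recomputed from the benchmark table in the Description. The snippet below is only a worked calculation over those three rows; the variable names are illustrative.

```javascript
// Timings in seconds, copied from the "Source and Transform Nodes" table.
const results = [
  { pages: 100, md: 0.17, mdx: 3.12 },
  { pages: 1000, md: 0.9, mdx: 23.05 },
  { pages: 8000, md: 5.53, mdx: 192.8 },
];

// Ratio of mdx time to md time at each page count.
const ratios = results.map(({ pages, md, mdx }) => ({
  pages,
  ratio: mdx / md,
}));

for (const { pages, ratio } of ratios) {
  console.log(`${pages} pages: mdx took ${ratio.toFixed(1)}x as long as md`);
}
```

The ratio grows with the page count, so the gap between the two plugins widens as the site gets larger.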

Environment

System:
  OS: Windows 10
  CPU: (16) x64 Intel® Core™ i9-9900K CPU @ 3.60GHz
Binaries:
  Yarn: 1.18.0 - C:\Program Files (x86)\Yarn\bin\yarn.CMD
  npm: 6.9.0 - C:\Program Files\nodejs\npm.CMD
Languages:
  Python: 2.7.15 - /c/Users/JYoun/.windows-build-tools/python27/python
Browsers:
  Edge: 44.18362.449.0
npmPackages:
  gatsby: ^2.19.5 => 2.19.45
  gatsby-image: ^2.2.39 => 2.2.44
  gatsby-plugin-benchmark-reporting: * => 0.0.13
  gatsby-plugin-feed: ^2.3.26 => 2.3.29
  gatsby-plugin-google-analytics: ^2.1.34 => 2.1.38
  gatsby-plugin-manifest: ^2.2.38 => 2.2.48
  gatsby-plugin-mdx: ^1.0.83 => 1.0.83
  gatsby-plugin-offline: ^3.0.32 => 3.0.41
  gatsby-plugin-react-helmet: ^3.1.21 => 3.1.24
  gatsby-plugin-sharp: ^2.4.0 => 2.4.13
  gatsby-plugin-typography: ^2.3.21 => 2.3.25
  gatsby-remark-copy-linked-files: ^2.1.36 => 2.1.40
  gatsby-remark-images: ^3.1.42 => 3.1.50
  gatsby-remark-prismjs: ^3.3.30 => 3.3.36
  gatsby-remark-responsive-iframe: ^2.2.31 => 2.2.34
  gatsby-remark-smartypants: ^2.1.20 => 2.1.23
  gatsby-source-filesystem: ^2.1.46 => 2.1.56
  gatsby-transformer-remark: ^2.6.48 => 2.6.59
  gatsby-transformer-sharp: ^2.3.13 => 2.3.19

Issue Analytics

State:
Created 3 years ago
Reactions: 4
Comments: 11 (5 by maintainers)

Top GitHub Comments

2 reactions

johno commented, Mar 25, 2020

Here’s where I’m at so far: yesterday I ran benchmarks and could confirm I’m seeing the same growth in time during sourcing/transforming nodes. Each time I double the number of MDX pages I see ~2x increase in time in “source and transform nodes”. It appears to be continually growing, too. 🏒

Files	Total build	Source and transform nodes
500	30s	13s
1k	51s	23s (+1.7x)
2k	90s	47s (+2x)
4k	180s	100s (+2x)
8k	440s	230s (+2.3x)
16k	1100s	530s (+2.3x)

After some digging I found a usage of a manual node filter for type using getNodes rather than getNodesByType. This, unsurprisingly, caused a lot of unnecessary node traversal at scale. That fix (in #22555) saw a ~30% reduction in build times for 16k pages (using my benchmark site so YMMV).
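
The difference between the two lookups can be sketched as follows. getNodes and getNodesByType are the Gatsby helpers named above; the createStore function and the sample nodes are hypothetical stand-ins so the snippet runs on its own, and this is not the code that was changed in #22555.

```javascript
// Hypothetical in-memory stand-in for Gatsby's node store (illustration only).
// It indexes nodes by type once, so a per-type lookup does not scan everything.
function createStore(nodes) {
  const byType = new Map();
  for (const node of nodes) {
    const type = node.internal.type;
    if (!byType.has(type)) byType.set(type, []);
    byType.get(type).push(node);
  }
  return {
    getNodes: () => nodes,
    getNodesByType: (type) => byType.get(type) || [],
  };
}

// Slow pattern: fetch every node in the store, then filter by type by hand.
// The cost grows with the total node count, not just the number of Mdx nodes.
function findMdxNodesSlow({ getNodes }) {
  return getNodes().filter((node) => node.internal.type === "Mdx");
}

// Faster pattern: ask the store for the one type that is needed.
function findMdxNodesFast({ getNodesByType }) {
  return getNodesByType("Mdx");
}

const store = createStore([
  { id: "1", internal: { type: "File" } },
  { id: "2", internal: { type: "Mdx" } },
  { id: "3", internal: { type: "SitePage" } },
  { id: "4", internal: { type: "Mdx" } },
]);

const slowIds = findMdxNodesSlow(store).map((node) => node.id);
const fastIds = findMdxNodesFast(store).map((node) => node.id);
console.log(slowIds, fastIds);
```

Both functions return the same nodes. The manual filter has to touch every node of every type, which is the unnecessary traversal described above once a site holds thousands of File, Mdx, and page nodes.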

My benchmark results after the change

Files	Total build	Source and transform nodes
8k	350s	150s
16k	750s	400s

I’ll keep digging into this as I get time and will report back any additional findings and performance improvements as we get them PRed in. I suspect there’s a lot more low hanging fruit.

It’s also important to note that MDX will always be quite a bit slower than MD since it’s doing a lot more under the covers, but it definitely shouldn’t be 📈🙀

2 reactions

johnayoung commented, Mar 25, 2020

@johno Thanks for the updates here. Here is the second round of testing after pulling down the new packages (benchmark repo: https://github.com/johnatspreadstreet/gatsby-md-vs-mdx):


Source and Transform Nodes
# of Pages	mdx pre-patch (s)	mdx post-patch (s)	diff (s)
100	3.12	3.7	+0.58
1000	23.05	24.3	+1.25
8000	192.8	183.4	-9.4

Looks like ~5% better performance in the upper bucket, with pretty much even performance in the lower buckets.

If you need me to test anything else, or have hunches you want checked, I'm happy to do so.
