☂️ This umbrella issue tracks work on improving the performance of MDX.

I’ve been working with @pvdz on MDX performance. We’ve noted a few aspects that add unnecessary work which we should be able to reduce, especially in v2.

Numerous babel parse and transformation steps

Firstly, we have multiple babel parse steps throughout the MDX transpilation pipeline.

Imports and exports

  • Partitioning imports and exports
  • Finding the default export

Peter has done some work here in gatsby-plugin-mdx (gatsbyjs/gatsby#25437) that we can potentially adapt for use in core.

Shortcode generation

We use Babel to figure out what imports and exports exist, and then use that to instantiate variables coming from MDXProvider with makeShortcode. This is also related to gatsbyjs/gatsby#25437.

mdxType

This is used by the runtime (react/preact/vue) to determine which component to render. This is something we can do from the MDXAST in v2 since the JSX structure is represented.

Returning a compiled string that inevitably needs to be transpiled

Secondly, in addition to these parse steps, we also return a JSX string. In nearly all cases this JSX string is then transpiled to JS and mdx pragma function calls. This was originally an intentional output because we wanted to make MDX more palatable and familiar. However, it might make sense to serialize directly to function calls and JS.

This would remove a Babel step that users currently need (unless they use optional syntax or need browser polyfills, which would still be achievable in userland).
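
To make the idea concrete, here is a minimal sketch of serializing a tree straight to pragma calls instead of JSX. The node shape (`type`, `name`, `props`, `children`, `value`) and the `toCalls` helper are assumptions made up for illustration, not the real MDXAST or any MDX API, and it only handles plain host elements and text.

```javascript
// Toy node shape (an assumption, not the real MDXAST):
//   { type: 'element', name, props, children }  or  { type: 'text', value }
function toCalls(node, pragma = 'mdx') {
  // Text nodes become string literals.
  if (node.type === 'text') return JSON.stringify(node.value);

  // Empty or missing props are emitted as `null`, like compiled JSX.
  const hasProps = node.props && Object.keys(node.props).length > 0;
  const props = hasProps ? JSON.stringify(node.props) : 'null';

  // Children are serialized recursively as extra arguments.
  const children = (node.children || []).map((child) => toCalls(child, pragma));
  return `${pragma}(${[JSON.stringify(node.name), props, ...children].join(', ')})`;
}

const tree = {
  type: 'element',
  name: 'h1',
  props: {},
  children: [{ type: 'text', value: 'Hello' }],
};
console.log(toCalls(tree)); // prints: mdx("h1", null, "Hello")
```

A real serializer would also have to emit component references as identifiers rather than strings and carry through embedded JS expressions, which this sketch leaves out.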


You all are welcome to bring up other areas of the codebase we can make more performant or other ideas as well! In fact, we’d love your thoughts.

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 13
  • Comments: 8 (5 by maintainers)

Top GitHub Comments

4 reactions
pvdz commented, Jul 17, 2020

Yeah so if we keep certain artificial limitations (which already apply today) in place then we can distill the imports/exports from the mdast without the need of Babel. That’s been the source of some significant perf improvements at startup time (like https://github.com/gatsbyjs/gatsby/pull/25757).

The reasoning here is that the import and export syntax is very strict and if we disallow comments in between then a regular expression or simple string manipulation can quickly get us the answers we need (-> the symbols being imported and exported).

For imports, the only limitation might be to disallow comments inside an import and allow them only at the end of a line. These are the forms of import:

  • import ID from 'y'
  • import * as ID from 'y'
  • import {ID} from 'y'
  • import {ID as ID2} from 'y'
  • import ID, {ID2} from 'y'
  • import ID, * as ID2 from 'y'

The {} pattern can repeat, and in each case the as part is optional. For the fix in Gatsby, to get the imported idents, I took these imports and used a regex to remove all the parts we were not interested in, leaving us with comma separated sets of ID or ID as ID2. You can easily take the last ID and that’ll be the one you want.

Leaning on the fact that imports are constants (and the input is valid), there is no need to dedupe them.

So to make life easy, the only syntactical restriction, beyond non-standard syntax of course, is to disallow comments inside the import declaration. And maybe disallow the variant where from is omitted (where you import a module for side effects).
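
A rough sketch of that approach, under the stated restrictions (valid input, one import per string, no comments inside the declaration, conventional spacing around the keywords). `importedIdentifiers` is a hypothetical helper written for this illustration; it is not the code from the Gatsby fix.

```javascript
// Collect the locally bound names of one import declaration without a JS parser.
function importedIdentifiers(line) {
  // Capture everything between `import` and `from`. Side-effect imports
  // (`import 'y'`) have no `from` clause and bind nothing.
  const match = /^\s*import\s+([\s\S]+?)\s+from\s*['"]/.exec(line);
  if (!match) return [];

  return match[1]
    .replace(/[{}]/g, ',') // braces only group names, so treat them as separators
    .split(',')
    .map((part) => part.trim())
    .filter(Boolean)
    // `ID`, `ID as ID2` and `* as ID` all end with the local binding name.
    .map((part) => part.split(/\s+/).pop());
}

console.log(importedIdentifiers("import ID, {ID2 as ID3, ID4} from 'y'"));
// prints: [ 'ID', 'ID3', 'ID4' ]
```

Taking the last word of each comma separated part is the "take the last ID" step described above.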

For exports it’s a little trickier, mainly because you can export arbitrary expressions and because of defaults in destructuring. However, it turns out that exports are currently limited to a single line. That’s great because that makes them easy to slice out.

Furthermore, if you apply the same comment restriction to exports and disallow destructuring defaults, you can “cheat” your way out of requiring a JS parser and still distill all the exported symbols, as well as find the default export. You can even support the newer export <pattern> from 'file', which I believe is currently not supported.

  • export default function abc(){}
  • export const foo = bar
  • export class Boo {}
  • export { ding, dong as dang }
  • export let [a, b] = obj
  • export let [a = 1, b = 2] = obj <-- this is the one to disallow

In all the above cases, except the last, you can parse up to the first = character (for var, let, and const exports) to get all the exported symbol names safely. The syntax for function and class is restricted enough by itself. The re-export syntax can be handled the same way as the imports above. All in all, it’ll be much faster than the overhead of a full JS parse.
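
The export side could look roughly like the sketch below. `exportedNames` is a hypothetical helper for illustration only. It assumes valid single-line input, no comments, no destructuring defaults, and no object patterns that rename (such as `{a: b}`), and it only reads up to the first = as described above.

```javascript
// List the exported names of a single-line export without a JS parser.
function exportedNames(line) {
  const src = line.trim();

  // export default ...
  if (/^export\s+default\b/.test(src)) return ['default'];

  // export function abc(){}  /  export class Boo {}
  const decl = /^export\s+(?:async\s+)?(?:function\s*\*?|class)\s*([A-Za-z_$][\w$]*)/.exec(src);
  if (decl) return [decl[1]];

  // export const foo = bar  /  export let [a, b] = obj
  // Only the text before the first `=` is inspected, which is why
  // destructuring defaults such as `[a = 1, b = 2]` must be disallowed.
  const binding = /^export\s+(?:const|let|var)\s+([^=]+)/.exec(src);
  if (binding) return binding[1].match(/[A-Za-z_$][\w$]*/g) || [];

  // export { ding, dong as dang }, optionally followed by `from 'file'`
  const list = /^export\s*\{([^}]*)\}/.exec(src);
  if (list) {
    return list[1]
      .split(',')
      .map((part) => part.trim())
      .filter(Boolean)
      .map((part) => part.split(/\s+/).pop()); // keep the name after `as`
  }
  return [];
}

console.log(exportedNames('export { ding, dong as dang }')); // prints: [ 'ding', 'dang' ]
```

Finding the default export then comes down to checking whether the result contains `default`.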

For JSX serialization you can use a faster parser/printer than Babel. I know Acorn can do it. There’s also Sucrase, and a few others.

My suggestion to John was to default to anything fast and to expose an option that lets the user do it instead, since mdx doesn’t reaaally care how the jsx gets compiled to JS. Or wouldn’t need to, as far as I understand. So a user could give mdx a callback like function callback(jsxString) { return parser(jsxString).serialize(pragma); } and mdx would just run it instead.

If I’m not mistaken, this way MDX wouldn’t need to run a JS parser at all.

One other potential trick is to concat the jsx expressions together with a searchable separator (an identifier of sorts, or the debugger statement), feed them to a parser, print them again, and split on the separator. That may already be what’s happening now, I’m not sure…?
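
A minimal sketch of that batching trick, assuming none of the snippets contain a debugger statement themselves and that the printer keeps the separator intact. `transformBatch` and the toy `rename` transform are made-up names for illustration; in practice `transform` would be whichever parser/printer is in use.

```javascript
// Join snippets with a separator statement, transform once, split again.
const SEPARATOR = '\ndebugger;\n';

function transformBatch(snippets, transform) {
  // One call to the transform instead of one call per snippet.
  const output = transform(snippets.join(SEPARATOR));

  // A printer may reformat whitespace, so split loosely on the statement.
  const parts = output.split(/\s*\bdebugger;\s*/);
  if (parts.length !== snippets.length) {
    throw new Error('separator did not survive the transform');
  }
  return parts;
}

// Toy stand-in for a real parse/print step: renames `foo` to `bar`.
const rename = (code) => code.replace(/\bfoo\b/g, 'bar');
console.log(transformBatch(['foo + 1', 'foo(2)'], rename));
// prints: [ 'bar + 1', 'bar(2)' ]
```

Whether one combined run beats many small ones would need measuring.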

Oh and a third option is to allow the user to pass through a Babel config / options for the whole build step. That way if Babel is run inside MDX anyways, it can just as well also do all the other transformations, like polyfill transforms etc, so that the main pipeline doesn’t need to process it again. Potentially. But that might be a pretty big Pandora’s box of complexity to open up.

1 reaction
wooorm commented, Jul 17, 2020

Probably also faster if we only process jsx there, and nothing else, leaving that up to folks. But indeed, I’m wondering about the benchmarks of 100 expressions vs 1 file.
