Slow parsing performance

I’ve just started using this library to check that parsed JSON objects conform to the types I define. I’m really enjoying it, but I’m afraid I’ve hit a performance snag.

Everything works fine with small tests, but when I ran it on a slightly bigger input file (~10MB) I noticed that it was really slow.

I’ve tried the latest version and the beta, and check vs parse, with similar results. The beta is faster, but I don’t see a big difference between check and parse.

After profiling the check version in the beta, I’m seeing calls to .check take between 500ms and 1100ms per object.

Are those numbers typical, or am I doing something wrong with the schema definitions?

My schema definitions look like:

import { z } from "zod";

// Nested schemas are declared first so EntryJsonSchema can reference them.
const ContentJsonSchema = z.object({
  id: z.string().optional().nullable(),
  title: z.string().optional().nullable(),
  version: z.union([z.number(), z.string()]).optional().nullable(),
});

const AnswerJsonSchema = z.object({
  key: z.string().optional().nullable(),
  value: z.any().optional().nullable(),
});

const ResultJsonSchema = z.object({
  key: z.string().optional().nullable(),
  value: z.any().optional().nullable(),
});

// Top-level entry: every field may be missing or null.
const EntryJsonSchema = z.object({
  a: z.string().optional().nullable(),
  b: z.string().optional().nullable(),
  id: z.string().optional().nullable(),
  creation: z.string().optional().nullable(),
  content: ContentJsonSchema.optional().nullable(),
  labels: z.string().array().optional().nullable(),
  answers: AnswerJsonSchema.array().optional().nullable(),
  results: ResultJsonSchema.array().optional().nullable(),
});

I’m really hoping that there is a way to speed it up, as this is too expensive for the use case where I’ll have to process files with 100k+ objects.

Thanks!
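
To put numbers on a per-object cost like the one described above without a full profiler, a small timing loop can help. The following is only a sketch: `timeValidation` and `TimingSummary` are hypothetical names that do not come from zod, and the validator passed in is a stand-in so the snippet runs without any library installed. In practice one would pass something like `(x) => EntryJsonSchema.parse(x)`, keeping in mind that `parse` throws on invalid input, which would abort the loop.

```typescript
// Hypothetical helper (not part of zod): times an arbitrary validation
// function over a batch of inputs and summarises the per-item cost.
interface TimingSummary {
  count: number;   // number of items validated
  totalMs: number; // total wall-clock time spent in `validate`
  meanMs: number;  // average time per item (0 for an empty batch)
  maxMs: number;   // slowest single item
}

function timeValidation<T>(
  items: readonly T[],
  validate: (item: T) => unknown,
): TimingSummary {
  let totalMs = 0;
  let maxMs = 0;
  for (const item of items) {
    const start = performance.now();
    validate(item); // the call being measured
    const elapsed = performance.now() - start;
    totalMs += elapsed;
    if (elapsed > maxMs) maxMs = elapsed;
  }
  const count = items.length;
  return { count, totalMs, meanMs: count === 0 ? 0 : totalMs / count, maxMs };
}

// Stand-in data and validator, used only so the sketch is runnable as-is.
const sample = [{ id: "1" }, { id: null }, {}];
const summary = timeValidation(sample, (entry) => typeof entry === "object");
console.log(summary);
```

Multiplying `meanMs` by the expected number of objects (100k+ in this case) gives a rough estimate of total validation time for a full file.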

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 3
  • Comments: 49 (19 by maintainers)

Top GitHub Comments

joekrill commented, Jul 9, 2021 (60 reactions)

This morning I decided to check in on zod and see if there have been any changes (it’s been a few weeks since I last checked), and WOW! HUGE improvements since v3.1! Here are the results of my latest benchmark on our internal dataset, now including zod 3.5.1:

library   version    parse time (ms)
zod       3.5.1        76.74
zod       3.1.0      2453.06
zod       3.0.0      2080.56
zod       1.11.17     723.87
myzod     1.11.17      26.72

So just a massive improvement. This is so awesome. Thank you so much to @milankinen, @colinhacks, and everyone who has been contributing - I really appreciate you all. This is really amazing and definitely makes zod a usable option for me again - I’ll definitely be looking into switching back.
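
To make the size of that improvement concrete, the parse times from the table above can be turned into ratios against zod 3.5.1. This is just arithmetic on the numbers as posted; `relativeTo` is a hypothetical helper name, and the labels are copied from the table unchanged.

```typescript
// Parse times in milliseconds, copied from the benchmark table above.
const parseTimesMs: Record<string, number> = {
  "zod 3.5.1": 76.74,
  "zod 3.1.0": 2453.06,
  "zod 3.0.0": 2080.56,
  "zod 1.11.17": 723.87,
  "myzod 1.11.17": 26.72,
};

// Hypothetical helper: divides every entry by the baseline entry's time.
// A ratio above 1 means that entry was slower than the baseline.
function relativeTo(
  baseline: string,
  times: Record<string, number>,
): Record<string, number> {
  const base = times[baseline];
  if (base === undefined) throw new Error("unknown baseline: " + baseline);
  const ratios: Record<string, number> = {};
  for (const name of Object.keys(times)) {
    ratios[name] = times[name] / base;
  }
  return ratios;
}

const ratios = relativeTo("zod 3.5.1", parseTimesMs);
for (const name of Object.keys(ratios)) {
  console.log(name + ": " + ratios[name].toFixed(2) + "x the zod 3.5.1 time");
}
```

On these figures, zod 3.1.0 took roughly 32 times as long as 3.5.1 on this dataset, while myzod still finished in about a third of the 3.5.1 time.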

Side note: I did have to make some minor tweaks to my schema where I was using passthrough with intersection, which causes an error in 3.5.1, but that was pretty straightforward to clear up.

tmcw commented, Mar 17, 2022 (13 reactions)

I’ve implemented #1023, #1021, and #1022 to explore different approaches to making zod faster. I think updating the target is the highest payoff for the least amount of change, but the other PRs also have some notable perf improvements. I’ll keep looking - I think it’s possible to make zod as fast as the alternatives.
