Slow parsing performance

See original GitHub issue

I’ve just started using this library to check that parsed JSON objects conform to the types I define. I’m really enjoying it, but I’m afraid I’ve hit a performance snag.

Everything works fine with small tests but when I ran it on a slightly bigger input file (~10MB) I noticed that it was really slow.

I’ve tried the latest version and the beta, and check vs parse, with similar results. The beta is faster, but I don’t see a big difference between check and parse.

After profiling the check version in the beta, I’m seeing calls to .check taking in the 500ms-1100ms range per object.

Are those numbers typical, or am I doing something wrong with the schema definitions?

My schema definitions look like:

import { z } from "zod";

// Nested schemas come first: a const cannot be referenced before its declaration.
const ContentJsonSchema = z.object({
  id: z.string().optional().nullable(),
  title: z.string().optional().nullable(),
  version: z.union([z.number(), z.string()]).optional().nullable(),
});

const AnswerJsonSchema = z.object({
  key: z.string().optional().nullable(),
  value: z.any().optional().nullable(),
});

const ResultJsonSchema = z.object({
  key: z.string().optional().nullable(),
  value: z.any().optional().nullable(),
});

// Top-level entry; every field may be absent or null.
const EntryJsonSchema = z.object({
  a: z.string().optional().nullable(),
  b: z.string().optional().nullable(),
  id: z.string().optional().nullable(),
  creation: z.string().optional().nullable(),
  content: ContentJsonSchema.optional().nullable(),
  labels: z.string().array().optional().nullable(),
  answers: AnswerJsonSchema.array().optional().nullable(),
  results: ResultJsonSchema.array()
    .optional()
    .nullable(),
});

I’m really hoping that there is a way to speed it up, as this is too expensive for the use case where I’ll have to process files with 100k+ objects.
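
To put the reported numbers in context, here is a minimal, library-agnostic sketch of how per-object validation time could be measured and projected onto a larger file. The names `Validator`, `timeValidation`, `projectTotalMs`, and `isEntryLike` are hypothetical and not part of zod; the stand-in validator only exists so the sketch runs without dependencies.

```typescript
// Hypothetical helper types and functions for illustration; not part of zod.
type Validator = (input: unknown) => boolean;

interface TimingResult {
  count: number;      // number of objects validated
  validCount: number; // how many passed validation
  totalMs: number;    // wall-clock time for the whole batch
  perItemMs: number;  // average time per object
}

// Runs the validator over every item and records the elapsed wall-clock time.
function timeValidation(validate: Validator, items: unknown[]): TimingResult {
  const start = Date.now();
  let validCount = 0;
  for (const item of items) {
    if (validate(item)) validCount++;
  }
  const totalMs = Date.now() - start;
  const perItemMs = items.length > 0 ? totalMs / items.length : 0;
  return { count: items.length, validCount, totalMs, perItemMs };
}

// Projects the total time for a larger input from a measured per-object cost.
function projectTotalMs(perItemMs: number, objectCount: number): number {
  return perItemMs * objectCount;
}

// Stand-in validator (assumption). With zod, one would instead pass something
// like: (x) => EntryJsonSchema.safeParse(x).success
const isEntryLike: Validator = (input) =>
  typeof input === "object" && input !== null && !Array.isArray(input);

const sample: unknown[] = Array.from({ length: 10000 }, (_, i) => ({
  id: String(i),
  labels: ["x"],
}));
const result = timeValidation(isEntryLike, sample);
console.log(`${result.count} objects validated in ${result.totalMs} ms`);

// At the low end of the range reported above (500 ms per object),
// 100k objects would take:
const projectedHours = projectTotalMs(500, 100000) / 3600000;
console.log(`${projectedHours.toFixed(1)} hours`); // prints "13.9 hours"
```

Timing the whole batch and dividing by the count tends to give a steadier figure than timing single calls, since individual measurements at millisecond resolution are noisy.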

Thanks!

Issue Analytics

State:
Created 3 years ago
Reactions: 3
Comments: 49 (19 by maintainers)

Top GitHub Comments

60 reactions

joekrill commented, Jul 9, 2021

This morning I decided to check in on zod and see if there have been any changes (it’s been a few weeks since I last checked), and WOW! HUGE improvements since v3.1! Here are the results of my latest benchmark on our internal dataset, now including Zod 3.5.1:

library	`parse` time (ms)
zod 3.5.1	76.74
zod 3.1.0	2453.06
zod 3.0.0	2080.56
zod 1.11.17	723.87
myzod 1.11.17	26.72

So just a massive improvement. This is so awesome. Thank you so much to @milankinen, @colinhacks, and everyone who has been contributing - I really appreciate you all. This is really amazing and definitely makes zod a usable option for me again - I’ll definitely be looking into switching back.

Side note: I did have to make some minor tweaks to my schema where I was using passthrough with intersection, which causes an error in 3.5.1, but that was pretty straightforward to clear up.
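
For a sense of scale, the timings in the table above can be turned into ratios against zod 3.5.1. This is only a small arithmetic sketch; `relativeTo` is a hypothetical helper, and the labels and numbers are copied from the table as given.

```typescript
// Parse times in milliseconds, copied from the benchmark table above.
const parseTimesMs: Record<string, number> = {
  "zod 3.5.1": 76.74,
  "zod 3.1.0": 2453.06,
  "zod 3.0.0": 2080.56,
  "zod 1.11.17": 723.87,
  "myzod 1.11.17": 26.72,
};

// Hypothetical helper: each entry's time divided by the baseline's time.
function relativeTo(
  baseline: string,
  times: Record<string, number>
): Record<string, number> {
  const base = times[baseline];
  if (base === undefined) throw new Error(`unknown baseline: ${baseline}`);
  const ratios: Record<string, number> = {};
  for (const [name, ms] of Object.entries(times)) {
    ratios[name] = ms / base;
  }
  return ratios;
}

const ratios = relativeTo("zod 3.5.1", parseTimesMs);
for (const [name, ratio] of Object.entries(ratios)) {
  console.log(`${name}: ${ratio.toFixed(1)}x the zod 3.5.1 time`);
}
```

On these figures, zod 3.1.0 took about 32 times as long as zod 3.5.1 on the same dataset.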

13 reactions

tmcw commented, Mar 17, 2022

I’ve implemented #1023, #1021, and #1022 to explore different avenues for making zod faster. I think updating the target is the highest payoff for the least amount of change, but the other PRs also have some notable perf improvements. I’ll keep looking - I think it’s possible to make zod as fast as the alternatives.