Slow parsing performance
I’ve just started using this library to check that parsed JSON objects conform to the types I define. I’m really enjoying it, but I’m afraid I’ve hit a performance snag.
Everything works fine with small tests but when I ran it on a slightly bigger input file (~10MB) I noticed that it was really slow.
I’ve tried the latest version and the beta, and check vs parse, with similar results. The beta is faster, but I don’t see a big difference between check and parse.
After profiling the check version in the beta, I’m seeing calls to .check taking in the 500ms-1100ms range per object.
Are those numbers typical, or am I doing something wrong with the schema definitions?
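For anyone wanting to reproduce per-object numbers like these outside a profiler, a small timing harness may help. The sketch below is not part of zod: `timeValidator`, `Validator`, `TimingReport`, and the stand-in `hasStringId` check are hypothetical names introduced for illustration, and it uses `Date.now()`, so it only resolves timings at millisecond granularity.

```typescript
// Hypothetical helper, not part of zod: times any boolean validator per input.
type Validator = (input: unknown) => boolean;

interface TimingReport {
  count: number;    // number of inputs validated
  failures: number; // inputs for which the validator returned false
  totalMs: number;  // wall-clock time for the whole run
  meanMs: number;   // totalMs / count (0 for an empty input list)
  maxMs: number;    // slowest single validation
}

function timeValidator(
  inputs: readonly unknown[],
  validate: Validator
): TimingReport {
  let failures = 0;
  let maxMs = 0;
  const start = Date.now();
  for (const input of inputs) {
    const t0 = Date.now();
    if (!validate(input)) failures++;
    const elapsed = Date.now() - t0;
    if (elapsed > maxMs) maxMs = elapsed;
  }
  const totalMs = Date.now() - start;
  const meanMs = inputs.length > 0 ? totalMs / inputs.length : 0;
  return { count: inputs.length, failures, totalMs, meanMs, maxMs };
}

// Stand-in validator (an assumption for the demo): accepts objects with a string `id`.
const hasStringId: Validator = (input) =>
  typeof input === "object" &&
  input !== null &&
  typeof (input as { id?: unknown }).id === "string";

const report = timeValidator([{ id: "a" }, { id: 1 }, null], hasStringId);
console.log(report);
```

With zod installed, the stand-in could presumably be swapped for something like (input) => EntryJsonSchema.safeParse(input).success, which makes it easier to compare versions or schema variants on the same input file.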
My schema definitions look like:
import { z } from "zod";

// Nested schemas are declared first so EntryJsonSchema can reference them.
const ContentJsonSchema = z.object({
  id: z.string().optional().nullable(),
  title: z.string().optional().nullable(),
  version: z.union([z.number(), z.string()]).optional().nullable(),
});
const AnswerJsonSchema = z.object({
  key: z.string().optional().nullable(),
  value: z.any().optional().nullable(),
});
const ResultJsonSchema = z.object({
  key: z.string().optional().nullable(),
  value: z.any().optional().nullable(),
});
const EntryJsonSchema = z.object({
  a: z.string().optional().nullable(),
  b: z.string().optional().nullable(),
  id: z.string().optional().nullable(),
  creation: z.string().optional().nullable(),
  content: ContentJsonSchema.optional().nullable(),
  labels: z.string().array().optional().nullable(),
  answers: AnswerJsonSchema.array().optional().nullable(),
  results: ResultJsonSchema.array().optional().nullable(),
});
I’m really hoping that there is a way to speed it up, as this is too expensive for the use case where I’ll have to process files with 100k+ objects.
Thanks!
Issue Analytics
- State:
- Created 3 years ago
- Reactions: 3
- Comments: 49 (19 by maintainers)
Top GitHub Comments
This morning I decided to check in on zod and see if there have been any changes (it’s been a few weeks since I last checked), and WOW! HUGE improvements since v3.1! Here are the results of my latest benchmark on our internal dataset, with Zod 3.5.1 added:
[Benchmark table not preserved; surviving column header: parse time (ms)]
So just a massive improvement. This is so awesome. Thank you so much to @milankinen, @colinhacks, and everyone who has been contributing - I really appreciate you all. This is really amazing and definitely makes zod a usable option for me again - I’ll definitely be looking into switching back.
Side note: I did have to make some minor tweaks to my schema where I was using passthrough with intersection, which is causing an error in 3.5.1, but that was pretty straightforward to clear up.
I’ve implemented #1023, #1021, and #1022 to explore different vectors for making zod faster. I think updating the target is the highest payoff for the least amount of change, but the other PRs also have some notable perf improvements. I’ll keep looking - I think it’s possible to make zod as fast as the alternatives.