Consider switching to the Jawn parser on JS

tl;dr The circe-jawn parser on JS has better semantics and in many cases outperforms the JSON.parse method currently used by circe-parser.

Benchmarks: https://armanbilge.github.io/jsoniter-scala/index-scalajs.html

Note that the above benchmark results were extracted from https://plokhotnyuk.github.io/jsoniter-scala/index-scalajs.html; all credit goes to @plokhotnyuk.


In https://github.com/circe/circe/pull/1791, the circe-jawn module was cross-built for JS. However, the circe-parser module continued to use the standard JS API JSON.parse for parsing and this was “unlikely to change” according to https://github.com/typelevel/jawn/pull/351#issuecomment-873123158. Presumably this was for performance-related reasons:

  1. JSON.parse is provided by the runtime, so no parsing code has to be bundled in applications’ size-sensitive JS artifacts.
  2. Runtimes are free to implement JSON.parse with highly-optimized native code, so it should be fast.

However, as noted in the circe docs, the use of JSON.parse also has some serious caveats:

https://github.com/circe/circe/blob/a015255b73d23505d5155deb5b09121d8d913cf6/docs/src/main/tut/parsing.md?plain=1#L61-L69

This (often surprising) difference in semantics can cause problems such as https://github.com/circe/circe/issues/393 and https://github.com/circe/circe/issues/1911#issuecomment-1069346826 and has “raised eyebrows”.
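
To make the caveat concrete, here is a minimal sketch in plain TypeScript of what the runtime's JSON.parse does with an integer that fits in a Scala Long but not in a JS number. It does not call circe at all; the value 2^53 + 1 and the variable names are chosen purely for illustration.

```typescript
// 2^53 + 1: a valid Scala Long, but not exactly representable as a JS number.
const longText = '{"id": 9007199254740993}';

// JSON.parse stores every JSON number as an IEEE 754 double.
const parsedDoc: { id: number } = JSON.parse(longText);

// Printing the value back shows which integer was actually kept.
const roundTripped: string = JSON.stringify(parsedDoc);

console.log(roundTripped);                                    // {"id":9007199254740992}
console.log(roundTripped.indexOf("9007199254740993") === -1); // true: the last digit changed
```

Any parser that goes through JSON.parse first has already lost the original digits by the time it builds its own AST, which is the kind of surprise the linked issues describe.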

“I’m surprised circe considers it better to be fast than correct, but it’s their call. Longs themselves are correct in Scala.js. Apparently it’s not correct to serialize Longs using the default mechanism of circe if you actually use their full range. Instead, using strings or a pair of Ints may work.”

@sjrd in https://discord.com/channels/632150470000902164/635668814956068864/905900410517204992
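
As a rough sketch of the "strings or a pair of Ints" idea, the TypeScript below shows that both encodings pass through JSON.parse unchanged. The payload shapes and field names (`id`, `hi`, `lo`) are hypothetical and are not something circe produces by default.

```typescript
// Hypothetical payload shapes: the same 64-bit value (2^53 + 1) carried
// as a string and as a pair of 32-bit halves, where value = hi * 2^32 + lo.
const idAsString = '{"id": "9007199254740993"}';
const idAsPair = '{"id": {"hi": 2097152, "lo": 1}}';

const fromString: { id: string } = JSON.parse(idAsString);
const fromPair: { id: { hi: number; lo: number } } = JSON.parse(idAsPair);

// The string keeps every digit because JSON.parse never treats it as a number.
console.log(fromString.id); // 9007199254740993

// Each half fits in 32 bits, which JS numbers represent exactly.
console.log(fromPair.id.hi, fromPair.id.lo); // 2097152 1
```

On the Scala side the string could then be read with `toLong`, or the halves recombined with `(hi.toLong << 32) | (lo & 0xFFFFFFFFL)`. Either way it requires a custom encoder and decoder, so it is a workaround and not a fix for the parser itself.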

Note that the circe-jawn.js parser does not have these issues and instead has semantics that match the JVM, since it is parsing from String to the Json AST with exactly the same code.

With much gratitude to @plokhotnyuk, who maintains comprehensive browser benchmarks for Scala.js JSON libraries, we now have concrete numbers comparing circe-jawn.js to circe-parser.js. I’ve trimmed down @plokhotnyuk’s webpage to just the relevant benchmarks for this comparison. For easiest viewing, I recommend selecting a specific browser to focus on.

https://armanbilge.github.io/jsoniter-scala/index-scalajs.html

It’s also possible to run the benchmarks yourself at: https://plokhotnyuk.github.io/jsoniter-scala/scala-2.13-fullopt.html

Here’s my rough summary/analysis from looking through the Chrome results, but please draw your own conclusions 😃

  • Jawn.js is overall very competitive with JSON.parse
  • Jawn.js consistently outperforms JSON.parse when parsing API responses (GH, Twitter, Google Maps)
  • In some benchmarks, Jawn.js is up to 4x faster than JSON.parse
  • Jawn.js is up to 5x slower when parsing certain numerics (BigDecimal, Double, Float) … but this is precisely the situation in which JSON.parse may return incorrect results, so it’s not really a fair comparison
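
To illustrate why the numerics comparison is not like-for-like, this small TypeScript sketch feeds JSON.parse two inputs that are valid JSON and would fit a BigDecimal, but not a double. The inputs are made up for illustration and are not taken from the benchmark suite.

```typescript
// Valid JSON numbers that a BigDecimal can hold but a double cannot.
const hugeText = "1e400";                              // beyond the double range
const preciseText = "0.12345678901234567890123456789"; // 29 decimal places

const hugeValue: number = JSON.parse(hugeText);
const preciseValue: number = JSON.parse(preciseText);

console.log(hugeValue);                            // Infinity
console.log(JSON.stringify(hugeValue));            // null
console.log(String(preciseValue) === preciseText); // false: digits were dropped
```

In these cases JSON.parse does less work because it discards information, while a parser that preserves the full textual number has to keep every digit.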

Besides raw performance, I had previously investigated how circe-jawn affects the size of the JS artifact. In https://github.com/http4s/http4s-dom/issues/10#issuecomment-1085244829 I estimated it contributes roughly 15 KB (fullOptJS+gcc, pre-gzip). I don’t think this is a big deal, and definitely not in the Typelevel-stack Scala.js apps I’ve seen 😆


In summary, I think the circe-parser module should switch to using the Jawn parser on JS (although maybe not until the next breaking release). This gets us:

  1. Semantics that match the JVM
  2. No gotchas/surprises around parsing of numerics
  3. Competitive or improved performance in most situations

Of course, the JSON.parse-based parser should continue to be provided in the circe-scalajs module. Users who are specifically concerned about artifact size and/or performance of numerics parsing, and who are willing to accept the shortcomings of JSON.parse, can use this parser instead.

Thanks for reading 😃

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

2 reactions
armanbilge commented, Jul 13, 2022

Thanks.

“I wish we had some benchmarks for nodejs as well to compare and make sure that meshes with what we are seeing with browsers.”

I’ll see if I can look into this. Note that Chrome and Node.js both use the same V8 JavaScript runtime, so I would expect the Chrome results to be representative for Node.js.

2 reactions
zarthross commented, Jul 13, 2022

@armanbilge I think it’s the right move for 0.15. I’ll think about it for 0.14.x and talk with @zmccoy. I wish we had some benchmarks for nodejs as well to compare and make sure that meshes with what we are seeing with browsers.

As for your question about specific users, no… I don’t know of any specific user who would have an issue.

@isomarcte It would be better if we could do some WASM magic or something to hyperoptimize our js json parser 😆
