Support for `dataclass`es

Given the close similarity between msgspec.Struct and dataclasses.dataclass, I am curious to what extent dataclasses can be used with msgspec and what this entails.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 11 (4 by maintainers)

Top GitHub Comments

jcrist commented, Dec 1, 2022 (4 reactions)

Thanks for the ping (and the nice benchmark plot). I’ve actually been revisiting this opinion, and I think you’ve convinced me that adding simple support for dataclasses is worth it. Now that we have support for TypedDict and NamedTuple objects, asking for dataclass support isn’t that far off. And compatibility with orjson is a convincing use case. Users really should use msgspec.Struct objects instead when possible (they’re much faster, and have fewer weird edge cases than dataclasses), but we can do a decent job with encoding/decoding dataclasses too.

I spent some time this evening experimenting with an implementation, and I’m pretty happy with the results. Encoding time is already much faster than what orjson provides, especially when slots=True is set:

In [1]: import msgspec, orjson

In [2]: from dataclasses import dataclass

In [3]: enc = msgspec.json.Encoder()

In [4]: @dataclass
   ...: class NoSlots:
   ...:     field_one: int
   ...:     field_two: int
   ...: 

In [5]: @dataclass(slots=True)
   ...: class Slots:
   ...:     field_one: int
   ...:     field_two: int
   ...: 

In [6]: no_slots = [NoSlots(i - 1, i + 1) for i in range(10000)]

In [7]: with_slots = [Slots(i - 1, i + 1) for i in range(10000)]

In [8]: %timeit enc.encode(no_slots)  # msgspec, no slots
561 µs ± 2.04 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [9]: %timeit orjson.dumps(no_slots)  # orjson, no slots
834 µs ± 1.69 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [10]: %timeit enc.encode(with_slots)  # msgspec, with slots
779 µs ± 20 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [11]: %timeit orjson.dumps(with_slots)  # orjson, with slots
3.71 ms ± 90.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [12]: class Struct(msgspec.Struct):
    ...:     field_one: int
    ...:     field_two: int
    ...: 

In [13]: structs = [Struct(i - 1, i + 1) for i in range(10000)]

In [14]: %timeit enc.encode(structs)  # msgspec structs
356 µs ± 307 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

JSON encoding is already done. Decoding will take a bit more work, but I’d estimate it will be done in the next week or two. I’ll reopen this issue for now as a reminder.

jcrist commented, Dec 2, 2022 (2 reactions)

Ok, #218 is in - msgspec now supports encoding/decoding dataclasses. This came together a lot quicker than I expected, and was pretty fun to work on. Dataclasses are not as performant or featureful as Struct types, but performance isn’t bad (especially for encoding).

I have a few other small fixups I’d like to get in, I’d expect a release sometime next week.
