Any guidance on implementing serialization from/deserialization to custom classes?
It’s often perfectly fine to serialize dicts/lists/etc. into Avro, and deserialize Avro into dicts/lists/etc. However, sometimes it would be nice to use custom classes to hold the data while it’s in Python.
This is possible in the standard Python JSON module, for example, using the `object_hook` argument. Something similar should be possible in Avro. For deserializing, a function could take `schema` and `object` as arguments, where `object` is whatever dict/list/etc. was parsed and `schema` is the Avro schema of that object. The function body could then decide whether to return the default `object` or instantiate a new object to return.
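To make the idea concrete, here is a minimal sketch of such a hook applied as a post-processing step over already-parsed data. It is pure Python and not part of fastavro's API: `RECORD_CLASSES`, `default_hook`, and `convert` are hypothetical names, the schema is assumed to be a plain dict, and unions and named-type references are passed through unchanged for brevity.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    age: int


# Hypothetical registry mapping Avro record names to Python classes.
RECORD_CLASSES = {"User": User}


def default_hook(schema, obj):
    """Return a class instance for registered record schemas, else obj unchanged."""
    if isinstance(schema, dict) and schema.get("type") == "record":
        cls = RECORD_CLASSES.get(schema["name"])
        if cls is not None:
            return cls(**obj)
    return obj


def convert(schema, obj, hook=default_hook):
    """Walk a parsed datum alongside its schema, applying the hook bottom-up."""
    if isinstance(schema, dict):
        kind = schema.get("type")
        if kind == "record":
            obj = {
                field["name"]: convert(field["type"], obj[field["name"]], hook)
                for field in schema["fields"]
            }
        elif kind == "array":
            obj = [convert(schema["items"], item, hook) for item in obj]
        elif kind == "map":
            obj = {key: convert(schema["values"], value, hook) for key, value in obj.items()}
    # Primitives, unions, and named references fall through to the hook as-is.
    return hook(schema, obj)


USER_SCHEMA = {
    "type": "record",
    "name": "User",
    "fields": [{"name": "name", "type": "string"}, {"name": "age", "type": "int"}],
}
TEAM_SCHEMA = {
    "type": "record",
    "name": "Team",
    "fields": [
        {"name": "title", "type": "string"},
        {"name": "members", "type": {"type": "array", "items": USER_SCHEMA}},
    ],
}

# "Team" is not registered, so it stays a dict; its members become User objects.
team = convert(TEAM_SCHEMA, {"title": "core", "members": [{"name": "Ada", "age": 36}]})
```

With fastavro, something like this could presumably be applied to each dict yielded by the reader, using the schema the file was written with. Record names may be namespaced in practice, so the registry keys would need to match whatever the schema actually contains.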
I’m curious how I might begin to implement something like this using the fastavro library. If anyone has any guidance, I’d really appreciate it.
Issue Analytics
- Created 5 years ago
- Comments:20 (2 by maintainers)
@redsk Cool! I’m interested to take a look at it, so hopefully I can do that soon.
Hi @scottbelden, I’ve written an open source library to generate Python classes from Avro schemas. It’s available at https://gitlab.com/Jaumo/pyavro-gen . It also includes some bits to integrate it with fastavro, Kafka and the Schema Registry (as this is required for my use case).
I also used dataclasses (Python 3.7), but I guess it would be possible to consider support for older versions as well.
Please let me know if you’re interested.
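For the serialization direction, dataclasses make the conversion back to plain dicts fairly cheap. The sketch below uses only the standard library's `dataclasses.asdict`; it is an illustration of the general approach, not pyavro-gen's actual API, and the `Address` and `User` classes are made up for the example.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Address:
    city: str


@dataclass
class User:
    name: str
    addresses: List[Address] = field(default_factory=list)


user = User("Ada", [Address("London")])

# asdict recurses into nested dataclasses, lists, and dicts,
# producing plain containers that an Avro writer can consume.
record = asdict(user)
```

The resulting `record` is an ordinary dict, so it could then be handed to a writer together with a matching schema. Keeping the class fields and the schema fields in sync is left to the caller here, which is the part a code generator would automate.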