
Reading Avro with specified reader and writer schemas

See original GitHub issue

Hi

In the original avro-tools, when I read data I can specify both a writer and a reader schema. This is related to backward/forward compatibility:

// "writer" and "reader" are org.apache.avro.Schema instances
SpecificDatumReader r = new SpecificDatumReader(writer, reader);
BinaryDecoder decoder = DecoderFactory.get().directBinaryDecoder(inputStream, null);
return r.read(null, decoder);

Is it possible to specify the writer and reader schemas with AvroMapper?
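
For context, here is a minimal, self-contained sketch of the schema-resolution behaviour the question refers to, using Avro's GenericDatumReader instead of SpecificDatumReader so it needs no generated classes; the Employee record, its two schema variants, and the field names are made up for illustration:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

public class AvroSchemaResolutionDemo {
    // Writer schema: the shape the data was originally serialized with
    static final Schema WRITER = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Employee\",\"fields\":["
        + "{\"name\":\"name\",\"type\":\"string\"}]}");

    // Reader schema: adds an "age" field with a default, so old data still resolves
    static final Schema READER = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Employee\",\"fields\":["
        + "{\"name\":\"name\",\"type\":\"string\"},"
        + "{\"name\":\"age\",\"type\":\"int\",\"default\":-1}]}");

    public static void main(String[] args) throws Exception {
        // Serialize one record using the writer schema
        GenericRecord rec = new GenericData.Record(WRITER);
        rec.put("name", "Alice");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().directBinaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(WRITER).write(rec, encoder);
        encoder.flush();

        // Deserialize it, resolving the writer schema against the reader schema
        BinaryDecoder decoder = DecoderFactory.get()
            .directBinaryDecoder(new ByteArrayInputStream(out.toByteArray()), null);
        GenericRecord resolved = new GenericDatumReader<GenericRecord>(WRITER, READER)
            .read(null, decoder);
        System.out.println(resolved); // {"name": "Alice", "age": -1}
    }
}

The reader schema adds an "age" field with a default value, so records written with the older writer schema still resolve and come back with age = -1.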

Issue Analytics

  • State: closed
  • Created: 7 years ago
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

2 reactions
cowtowncoder commented, Feb 13, 2017

This is now implemented for the upcoming 2.8.7. Usage is via AvroSchema, with something like:

AvroSchema writerSchema = mapper.schemaFrom(src);
AvroSchema resolving = writerSchema.withReaderSchema(
     mapper.schemaFrom(srcForReaderSchema));

and then just using the resulting schema.

Note on performance: while there may be some overhead for schema evolution, there should not be a significant difference during parsing, since most of the work is done while constructing the "resolving" schema definition. Because of this, it is strongly recommended that schemas be reused: instances are thread-safe, so they can be safely shared and reused across threads.
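
For reference, here is a minimal sketch of how the above could be wired together end to end with AvroMapper on 2.8.7+; the Employee POJO, the assumption that the schemas are available as JSON strings, and the readerFor(...).with(...) plumbing are illustrative choices, not part of the original comment:

import com.fasterxml.jackson.dataformat.avro.AvroMapper;
import com.fasterxml.jackson.dataformat.avro.AvroSchema;

public class JacksonAvroEvolutionExample {
    // Hypothetical POJO matching the reader schema
    public static class Employee {
        public String name;
        public int age;
    }

    // writerSchemaJson / readerSchemaJson are assumed to be Avro schemas as JSON strings
    public static Employee read(byte[] avroBytes,
                                String writerSchemaJson,
                                String readerSchemaJson) throws Exception {
        AvroMapper mapper = new AvroMapper();

        // Schema the bytes were actually written with
        AvroSchema writerSchema = mapper.schemaFrom(writerSchemaJson);

        // Resolve it against the schema we want to read as; the resolution work
        // happens here, so build this once and reuse it (instances are thread-safe)
        AvroSchema resolving = writerSchema.withReaderSchema(
                mapper.schemaFrom(readerSchemaJson));

        return mapper.readerFor(Employee.class)
                     .with(resolving)
                     .readValue(avroBytes);
    }
}

Since building the resolving schema does the heavy lifting, it makes sense to construct it once and reuse it across calls and threads, as noted above.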

1 reaction
cowtowncoder commented, Dec 7, 2016

@pszymczyk I guess my question is about the value of separate schemas. Since Jackson data-binding is somewhat more flexible, it should often be enough to just use the writer schema that was used for generation (see the sketch after this comment). And since that schema is always required to be present anyway, the value of a second schema (essentially, to rename fields) may not be that high.

On the API: the main challenge is how to pass the second schema via the general Jackson API. I suspect that, if it is necessary, it would be possible to modify the AvroSchema wrapper (or whatever the name was) to allow a secondary schema, and so maybe support this quite seamlessly. Avro parsers and generators must know how to handle AvroSchema anyway, so this might actually be a relatively simple change.

If you happen to have the time and interest, I would be happy to help get it integrated via a PR.
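
To illustrate the point about Jackson's flexibility, here is a rough sketch of reading with only the writer schema; the Employee POJO and the use of FAIL_ON_UNKNOWN_PROPERTIES to tolerate fields the POJO no longer declares are assumptions made for this example, not something stated in the issue:

import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.dataformat.avro.AvroMapper;
import com.fasterxml.jackson.dataformat.avro.AvroSchema;

public class WriterSchemaOnlyExample {
    // Hypothetical POJO; it no longer declares every field the writer schema has
    public static class Employee {
        public String name;
        // an "age" field may still exist in the writer schema and in the data
    }

    public static Employee read(byte[] avroBytes, String writerSchemaJson) throws Exception {
        AvroMapper mapper = new AvroMapper();
        // Ignore schema fields the POJO does not declare instead of failing
        mapper.disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES);

        AvroSchema writerSchema = mapper.schemaFrom(writerSchemaJson);
        return mapper.readerFor(Employee.class)
                     .with(writerSchema)
                     .readValue(avroBytes);
    }
}

Fields that exist only on the POJO are simply left at their Java defaults, which covers many simple evolution cases without a separate reader schema.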

