
Reading Avro with specified reader and writer schemas

See original GitHub issue

Hi

In the original avro-tools, when I read data I can specify both a writer and a reader schema. This is related to backward/forward compatibility:

// "writer" and "reader" are org.apache.avro.Schema instances
SpecificDatumReader r = new SpecificDatumReader(writer, reader);
BinaryDecoder decoder = DecoderFactory.get().directBinaryDecoder(inputStream, null);
return r.read(null, decoder);

Is it possible to specify the writer and reader schemas with AvroMapper?
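
For context, here is a minimal, self-contained sketch of the schema-resolution behaviour the question refers to, using Avro's GenericDatumReader instead of SpecificDatumReader so it needs no generated classes; the Employee record, its two schema variants, and the field names are made up for illustration:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

public class AvroSchemaResolutionDemo {
    // Writer schema: the shape the data was originally serialized with
    static final Schema WRITER = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Employee\",\"fields\":["
        + "{\"name\":\"name\",\"type\":\"string\"}]}");

    // Reader schema: adds an "age" field with a default, so old data still resolves
    static final Schema READER = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Employee\",\"fields\":["
        + "{\"name\":\"name\",\"type\":\"string\"},"
        + "{\"name\":\"age\",\"type\":\"int\",\"default\":-1}]}");

    public static void main(String[] args) throws Exception {
        // Serialize one record using the writer schema
        GenericRecord rec = new GenericData.Record(WRITER);
        rec.put("name", "Alice");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().directBinaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(WRITER).write(rec, encoder);
        encoder.flush();

        // Deserialize it, resolving the writer schema against the reader schema
        BinaryDecoder decoder = DecoderFactory.get()
            .directBinaryDecoder(new ByteArrayInputStream(out.toByteArray()), null);
        GenericRecord resolved = new GenericDatumReader<GenericRecord>(WRITER, READER)
            .read(null, decoder);
        System.out.println(resolved); // {"name": "Alice", "age": -1}
    }
}

The reader schema adds an "age" field with a default value, so records written with the older writer schema still resolve and come back with age = -1.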

Issue Analytics

  • State: closed
  • Created: 7 years ago
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

2 reactions
cowtowncoder commented, Feb 13, 2017

This is now implemented for the upcoming 2.8.7. Usage is via AvroSchema, with something like:

AvroSchema writerSchema = mapper.schemaFrom(src);
AvroSchema resolving = writerSchema.withReaderSchema(
     mapper.schemaFrom(srcForReaderSchema));

and then just using the resulting schema.

Note on performance: while there may be some overhead for schema evolution, there should not be a significant difference during parsing, since most of the work is done while constructing the "resolving" schema definition. Because of this, it is strongly recommended that schemas be reused: instances are thread-safe, so they can be safely shared and reused across threads.
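
For reference, here is a minimal sketch of how the above could be wired together end to end with AvroMapper on 2.8.7+; the Employee POJO, the assumption that the schemas are available as JSON strings, and the readerFor(...).with(...) plumbing are illustrative choices, not part of the original comment:

import com.fasterxml.jackson.dataformat.avro.AvroMapper;
import com.fasterxml.jackson.dataformat.avro.AvroSchema;

public class JacksonAvroEvolutionExample {
    // Hypothetical POJO matching the reader schema
    public static class Employee {
        public String name;
        public int age;
    }

    // writerSchemaJson / readerSchemaJson are assumed to be Avro schemas as JSON strings
    public static Employee read(byte[] avroBytes,
                                String writerSchemaJson,
                                String readerSchemaJson) throws Exception {
        AvroMapper mapper = new AvroMapper();

        // Schema the bytes were actually written with
        AvroSchema writerSchema = mapper.schemaFrom(writerSchemaJson);

        // Resolve it against the schema we want to read as; the resolution work
        // happens here, so build this once and reuse it (instances are thread-safe)
        AvroSchema resolving = writerSchema.withReaderSchema(
                mapper.schemaFrom(readerSchemaJson));

        return mapper.readerFor(Employee.class)
                     .with(resolving)
                     .readValue(avroBytes);
    }
}

Since building the resolving schema does the heavy lifting, it makes sense to construct it once and reuse it across calls and threads, as noted above.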

1 reaction
cowtowncoder commented, Dec 7, 2016

@pszymczyk I guess my question is about the value of separate schemas. Since Jackson data-binding is somewhat more flexible, it should often be enough to just use the writer schema that was used for generation (see the sketch after this comment). And since that schema is always required to be present anyway, the value of a second schema (essentially, to rename fields) may not be that high.

On the API: the main challenge is how to pass the second schema via the general Jackson API. I suspect that, if it is necessary, it would be possible to modify the AvroSchema wrapper (or whatever the name was) to allow a secondary schema, and so maybe support this quite seamlessly. Avro parsers and generators must know how to handle AvroSchema anyway, so this might actually be a relatively simple change.

If you happen to have the time and interest, I would be happy to help get it integrated via a PR.
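
To illustrate the point about Jackson's flexibility, here is a rough sketch of reading with only the writer schema; the Employee POJO and the use of FAIL_ON_UNKNOWN_PROPERTIES to tolerate fields the POJO no longer declares are assumptions made for this example, not something stated in the issue:

import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.dataformat.avro.AvroMapper;
import com.fasterxml.jackson.dataformat.avro.AvroSchema;

public class WriterSchemaOnlyExample {
    // Hypothetical POJO; it no longer declares every field the writer schema has
    public static class Employee {
        public String name;
        // an "age" field may still exist in the writer schema and in the data
    }

    public static Employee read(byte[] avroBytes, String writerSchemaJson) throws Exception {
        AvroMapper mapper = new AvroMapper();
        // Ignore schema fields the POJO does not declare instead of failing
        mapper.disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES);

        AvroSchema writerSchema = mapper.schemaFrom(writerSchemaJson);
        return mapper.readerFor(Employee.class)
                     .with(writerSchema)
                     .readValue(avroBytes);
    }
}

Fields that exist only on the POJO are simply left at their Java defaults, which covers many simple evolution cases without a separate reader schema.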

