Using Map's with Avro Generated Types Generates Weird Errors

Description of the bug

I have a project that uses Avro generated types (POJOs generated by the Avro compiler) as the type to be serialized with the producer. Everything works fine except when one of the fields is a map. When the field is a map and a value is put into that map, the following error is generated:

org.apache.pulsar.shade.org.apache.avro.UnresolvedUnionException: Not in union
   ["null",{"type":"array","items":{"type":"record","name":"PairCharSequenceDouble",
   "namespace":"org.apache.pulsar.shade.org.apache.avro.reflect",
   "fields":[{"name":"key","type":"string"},{"name":"value","type":"double"}]},
   "java-class":"java.util.Map"}]

Anecdotally, if the field containing the map is not set, the message is serialized properly, which indicates that the map field is indeed the problem. Also, the same type works fine when implemented by hand as a POJO class (without using an Avro compiler). This suggests that something goes wrong during message serialization when the type being serialized contains code generated by the compiler.
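One possible reading of the error, offered as an assumption rather than a confirmed diagnosis: the union in the message lists an array of `PairCharSequenceDouble` records tagged with `"java-class":"java.util.Map"`, which would be consistent with a schema derived by reflection from a map field whose key type is `CharSequence` instead of `String`. The sketch below uses only the Java standard library to show how the declared key type of such a field can be inspected. `GeneratedLike`, `HandWritten`, and `MapKeyInspector` are hypothetical stand-ins that do not appear in the report, and the `String` key in `HandWritten` is an assumption about what a hand-written POJO would use.

```java
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.util.Map;

class MapKeyInspector {
    // Returns the declared key type of the Map-typed field named "mapField".
    static Class<?> keyTypeOf(Class<?> owner) {
        try {
            Field field = owner.getDeclaredField("mapField");
            ParameterizedType type = (ParameterizedType) field.getGenericType();
            // First type argument of Map<K, V> is the key type K.
            return (Class<?>) type.getActualTypeArguments()[0];
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("no mapField on " + owner, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(keyTypeOf(GeneratedLike.class).getSimpleName()); // CharSequence
        System.out.println(keyTypeOf(HandWritten.class).getSimpleName());   // String
    }
}

// Hypothetical stand-in: mirrors the Map<CharSequence, Double> that the
// reproduction code passes to setMapField.
class GeneratedLike {
    Map<CharSequence, Double> mapField;
}

// Hypothetical stand-in: assumed shape of a hand-written POJO field.
class HandWritten {
    Map<String, Double> mapField;
}
```

If the generated class does declare `CharSequence` keys, that difference from a hand-written `Map<String, Double>` field would be one place to look when comparing the two cases.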

To Reproduce

Steps to reproduce the behavior:

  1. Define a new type like the one below:
{
    "namespace": "com.acme.something",
    "type": "record",
    "name": "AvroGenComplexType",
    "fields": [
        {
            "name": "stringField",
            "type": "string"
        },
        {
            "name": "booleanField",
            "type": "boolean"
        },
        {
            "name": "bytesField",
            "type": "bytes"
        },
        {
            "name": "intField",
            "type": "int"
        },
        {
            "name": "longField",
            "type": "long"
        },
        {
            "name": "floatField",
            "type": "float"
        },
        {
            "name": "doubleField",
            "type": "double"
        },
        {
            "name": "mapField",
            "type": [
                "null",
                {
                    "type": "map",
                    "values": "double"
                }
            ],
            "default": null
        },
        {
            "name": "innerField",
            "type": [
                "null",
                {
                    "type": "record",
                    "name": "AvroGenInnerType",
                    "fields": [
                        {
                            "name": "doubleField",
                            "type": "double"
                        },
                        {
                            "name": "arrayField",
                            "type": {
                                "type": "array",
                                "items": "string"
                            }
                        },
                        {
                            "name": "enumField",
                            "type": {
                                "type": "enum",
                                "name": "AvroGenMultipleOptions",
                                "symbols": [
                                    "FirstOption", "SecondOption",
                                    "ThirdOption", "FourthOption"
                                ],
                                "default": "FirstOption"
                            }
                        }
                    ]
                }
            ],
            "default": null
        }
    ]
}
  2. Using an Avro compiler, generate the types for Java.
  3. Write a Pulsar producer with the following code:
    protected AvroGenComplexType createAvroGenComplexType() {

        AvroGenComplexType.Builder complexType =
            AvroGenComplexType.newBuilder();

        complexType.setStringField(String.valueOf(RANDOM.nextInt()));
        complexType.setBooleanField(RANDOM.nextBoolean());
        byte[] bytes = new byte[1024];
        RANDOM.nextBytes(bytes);
        ByteBuffer byteBuffer = ByteBuffer.wrap(bytes);
        complexType.setBytesField(byteBuffer);
        complexType.setIntField(RANDOM.nextInt());
        complexType.setLongField(RANDOM.nextLong());
        complexType.setFloatField(RANDOM.nextFloat());
        complexType.setDoubleField(RANDOM.nextDouble());

        // ******** This part will generate the error ********
        Map<CharSequence, Double> mapField = new HashMap<>(1);
        mapField.put(String.valueOf(RANDOM.nextInt()), RANDOM.nextDouble());
        complexType.setMapField(mapField);
        // *****************************************************

        complexType.setInnerFieldBuilder(AvroGenInnerType.newBuilder()
            .setDoubleField(RANDOM.nextDouble())
            .setArrayField(Arrays.asList(String.valueOf(RANDOM.nextInt())))
            .setEnumField(AvroGenMultipleOptions.ThirdOption));

        return complexType.build();

    }

    public void produceAvroGenBasedMessages(String topic, int numMessages)
        throws PulsarClientException {
        Producer<AvroGenComplexType> producer = null;
        try {
            producer = pulsarClient.newProducer(
                Schema.AVRO(AvroGenComplexType.class))
                    .topic(topic)
                    .create();
            for (int i = 0; i < numMessages; i++) {
                producer.send(createAvroGenComplexType());
            }
        } finally {
            if (producer != null) {
                producer.closeAsync();
            }
        }
    }

  4. Execute the code and observe the error being thrown.

Expected behavior

The code should run to completion without raising any errors related to the type being serialized. This is expected in particular because the type was generated by the Avro compiler, which checks the schema for compliance with the spec.

Screenshots N/A

Desktop (please complete the following information):

  • OS: Tested on Linux (Fedora 31)

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

2 reactions
riferrei commented, Apr 9, 2020

Hi @codelipenghui,

Thank you so much for looking into this. I will be taking 2.5.1-RC for a spin for sure.

@riferrei

1 reaction
codelipenghui commented, Apr 9, 2020

@riferrei The problem is fixed by #6406. I tested your reproduction demo (with the 1.9.2 compiler) on the master branch and it passed. #6406 will be released in 2.5.1, and 2.5.1-RC is out for validation. If you are interested in the validation, you can take a look at the email thread https://lists.apache.org/thread.html/r7b2783d13007e621e2e880a37d808e6a9ec178ece2090e117b8c68e6%40<dev.pulsar.apache.org>, thanks.
