Avro Producer considers "avro.java.string" as part of the schema comparisons

I found an ArtifactNotFoundException while testing an Avro producer integrated with Apicurio Registry, using the following versions:

  • Quarkus 1.13.6.Final
  • Apicurio Registry 2.0.1.Final
  • Avro 1.10.2

Given the following Avro schema definition registered into the Apicurio Registry:

{
  "type": "record",
  "name": "AggregateMetric",
  "namespace": "com.redhat.banking.eda.model.events",
  "doc": "Aggregated Metric with important information.",
  "fields": [
    {
      "name": "name",
      "type": {
        "type": "string",
        "avro.java.string": "String"
      },
      "doc": "Metric Name."
    }
  ]
}

A producer application downloads it using the apicurio-registry-maven-plugin and generates the Java classes with the avro-maven-plugin, configured as follows:

            <plugin>
                <groupId>org.apache.avro</groupId>
                <artifactId>avro-maven-plugin</artifactId>
                <version>${avro.version}</version>
                <executions>
                    <execution>
                        <phase>generate-sources</phase>
                        <goals>
                            <goal>schema</goal>
                        </goals>
                        <configuration>
                            <sourceDirectory>${project.basedir}/src/main/resources/schemas</sourceDirectory>
                            <includes>
                                <include>**/*.avsc</include>
                            </includes>
                            <outputDirectory>${project.build.directory}/generated-sources/schemas</outputDirectory>
                            <stringType>String</stringType>
                        </configuration>
                    </execution>
                </executions>
            </plugin>

When the producer tries to publish a new record to Kafka using the following configuration in the application.properties file:

# Aggregate metrics Generator
%dev.mp.messaging.outgoing.generated-aggregate-metrics.connector=smallrye-kafka
%dev.mp.messaging.outgoing.generated-aggregate-metrics.topic=eda.events.aggregate-metrics
%dev.mp.messaging.outgoing.generated-aggregate-metrics.acks=all
%dev.mp.messaging.outgoing.generated-aggregate-metrics.key.serializer=org.apache.kafka.common.serialization.IntegerSerializer
%dev.mp.messaging.outgoing.generated-aggregate-metrics.value.serializer=io.apicurio.registry.serde.avro.AvroKafkaSerializer
%dev.mp.messaging.outgoing.generated-aggregate-metrics.apicurio.registry.headers.enabled=true
%dev.mp.messaging.outgoing.generated-aggregate-metrics.apicurio.registry.auto-register=false
%dev.mp.messaging.outgoing.generated-aggregate-metrics.apicurio.registry.avro.encoding=JSON

I get the following exception:

2021-06-17 15:19:47,213 ERROR [io.sma.rea.mes.kafka] (vert.x-eventloop-thread-22) SRMSG18206: Unable to write to Kafka from channel generated-aggregate-metrics (topic: eda.events.aggregate-metrics): io.apicurio.registry.rest.client.exception.ArtifactNotFoundException: No artifact with ID 'AggregateMetric' in group 'com.redhat.banking.eda.model.events' was found.

However, if the producer is configured with:

%dev.mp.messaging.outgoing.generated-aggregate-metrics.apicurio.registry.auto-register=true

Then the application works, but the Service Registry ends up with two versions of the same artifact. The second version, registered automatically by the producer, has the following schema:

{
  "type": "record",
  "name": "AggregateMetric",
  "namespace": "com.redhat.banking.eda.model.events",
  "doc": "Aggregated Metric with important information.",
  "fields": [
    {
      "name": "name",
      "type": {
        "type": "string",
        "avro.java.string": "String"
      },
      "doc": "Metric Name."
    }
  ]
}

Because the generated Java classes include the new avro.java.string property in their schema definition, Apicurio identifies it as a different schema.
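To see exactly where two schema versions diverge, a small structural diff can help. The following is a minimal sketch, not part of Apicurio or Avro: `schema_diff` is a hypothetical helper, and the `registered` schema is an assumption (it supposes the version registered by hand declared `name` as a plain `string`, which is the difference described above).

```python
def schema_diff(left, right, path="$"):
    """Yield (path, left_value, right_value) wherever two parsed schemas differ."""
    if isinstance(left, dict) and isinstance(right, dict):
        for key in sorted(set(left) | set(right)):
            if key not in left or key not in right:
                # Property present on one side only.
                yield (f"{path}.{key}", left.get(key), right.get(key))
            else:
                yield from schema_diff(left[key], right[key], f"{path}.{key}")
    elif isinstance(left, list) and isinstance(right, list) and len(left) == len(right):
        for index, (l_item, r_item) in enumerate(zip(left, right)):
            yield from schema_diff(l_item, r_item, f"{path}[{index}]")
    elif left != right:
        yield (path, left, right)


# Assumption: the hand-registered version used a plain "string" type.
registered = {
    "type": "record",
    "name": "AggregateMetric",
    "namespace": "com.redhat.banking.eda.model.events",
    "fields": [{"name": "name", "type": "string", "doc": "Metric Name."}],
}

# Shape of the schema embedded in the generated Java class.
generated = {
    "type": "record",
    "name": "AggregateMetric",
    "namespace": "com.redhat.banking.eda.model.events",
    "fields": [
        {
            "name": "name",
            "type": {"type": "string", "avro.java.string": "String"},
            "doc": "Metric Name.",
        }
    ],
}

differences = list(schema_diff(registered, generated))
for where, before, after in differences:
    print(where, before, after)
```

Under these assumptions the only reported difference is at `$.fields[0].type`, which is where the plugin adds the Java string hint.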

There seem to be some approaches to resolve it, basically aligning the registered Avro schema with the final schema produced by the avro-maven-plugin, but I don’t know whether that is the best approach.

Should Apicurio add some extra logic to skip this kind of “extra” property during schema comparison?
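The kind of comparison asked about here could look roughly like the sketch below. It is only an illustration of the idea, not how Apicurio compares schemas today: `strip_java_hints` is a hypothetical helper, and the `registered` schema assumes the hand-registered version used a plain `string` type. The helper drops every `avro.java.string` key and collapses the leftover `{"type": "string"}` wrapper, which Avro treats the same as `"string"`. It does not try to protect a field default that happens to contain that key.

```python
import json

JAVA_STRING_HINT = "avro.java.string"


def strip_java_hints(node):
    """Return a copy of a parsed Avro schema without avro.java.string hints."""
    if isinstance(node, list):
        return [strip_java_hints(item) for item in node]
    if isinstance(node, dict):
        cleaned = {
            key: strip_java_hints(value)
            for key, value in node.items()
            if key != JAVA_STRING_HINT
        }
        # {"type": "string"} and "string" are equivalent in Avro, so collapse
        # the wrapper that remains once the hint has been removed.
        if set(cleaned) == {"type"} and isinstance(cleaned["type"], str):
            return cleaned["type"]
        return cleaned
    return node


# Assumption: the hand-registered version used a plain "string" type.
registered = json.loads("""
{
  "type": "record",
  "name": "AggregateMetric",
  "namespace": "com.redhat.banking.eda.model.events",
  "fields": [{"name": "name", "type": "string", "doc": "Metric Name."}]
}
""")

# Shape of the schema embedded in the generated Java class.
generated = json.loads("""
{
  "type": "record",
  "name": "AggregateMetric",
  "namespace": "com.redhat.banking.eda.model.events",
  "fields": [
    {
      "name": "name",
      "type": {"type": "string", "avro.java.string": "String"},
      "doc": "Metric Name."
    }
  ]
}
""")

same_as_is = registered == generated
same_normalized = strip_java_hints(registered) == strip_java_hints(generated)
print(same_as_is, same_normalized)
```

With these two inputs the raw comparison fails while the normalized comparison succeeds, which is the behavior the question is asking for.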

There seem to be some existing issues related to this:

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 7 (5 by maintainers)

Top GitHub Comments

EricWittmann commented, Jul 1, 2021

Yes, I think disabling auto-register is definitely the way to go. And you may be right about there being a bug with EXPLICIT_ARTIFACT_VERSION. I’ve set aside some time tomorrow to look into it and get back to you.

Note that you only need to worry about the Serializer configuration. The Deserializer always uses the GlobalId of the specific schema in the registry, which is included in the Kafka message by the Serializer.
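As a rough mental model of why only the Serializer settings matter, consider the toy sketch below. Everything in it is hypothetical: `ToyRegistry`, `serialize`, and `deserialize` are illustrative names, not the Apicurio client API, and the lookup is simplified to an exact text match rather than the real artifact resolution strategy. It only shows the division of labor: the writing side resolves (or registers) a schema and stamps its global ID on the message, and the reading side fetches whatever that ID points to.

```python
class ToyRegistry:
    """In-memory stand-in for a schema registry (not the Apicurio API)."""

    def __init__(self):
        self._by_id = {}   # global id -> schema text
        self._next_id = 1

    def register(self, schema_text):
        global_id = self._next_id
        self._by_id[global_id] = schema_text
        self._next_id += 1
        return global_id

    def find_id(self, schema_text):
        # Simplification: match on exact schema text.
        for global_id, stored in self._by_id.items():
            if stored == schema_text:
                return global_id
        return None

    def get(self, global_id):
        return self._by_id[global_id]


def serialize(registry, schema_text, payload, auto_register):
    """Resolve the schema to a global id and attach it to the message."""
    global_id = registry.find_id(schema_text)
    if global_id is None:
        if not auto_register:
            raise LookupError("schema not found in registry")
        global_id = registry.register(schema_text)
    return {"global_id": global_id, "payload": payload}


def deserialize(registry, message):
    """Look the schema up purely by the id carried in the message."""
    return registry.get(message["global_id"]), message["payload"]


registry = ToyRegistry()
registry.register("schema registered by hand")  # receives global id 1

# The producer's schema text differs, so auto-register adds a second entry.
message = serialize(registry, "schema from generated class", {"name": "cpu"}, auto_register=True)
schema_text, payload = deserialize(registry, message)
print(message["global_id"], schema_text)
```

In this model, turning `auto_register` off makes `serialize` fail whenever the producer's schema text is not already stored, which mirrors the ArtifactNotFoundException reported above, while `deserialize` needs no schema configuration at all.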

EricWittmann commented, Jun 28, 2021

This is an excellent update with information we need to diagnose the issue, thanks @rmarting - we may need a little bit of time to dig into this. I see that you are using RecordIdStrategy which explains part of what we’re seeing here. But I would still need to dig into why it’s not using the version you pre-register. Maybe @famartinrh knows offhand, but I would need to play with it.
