Avro Producer considers "avro.java.string" as part of the schema comparison
Found an ArtifactNotFoundException while testing an Avro producer integrated with Apicurio Registry, with the following versions:
- Quarkus 1.13.6.Final
- Apicurio Registry 2.0.1.Final
- Avro 1.10.2
Given the following Avro schema definition registered in the Apicurio Registry:
{
  "type": "record",
  "name": "AggregateMetric",
  "namespace": "com.redhat.banking.eda.model.events",
  "doc": "Aggregated Metric with important information.",
  "fields": [
    {
      "name": "name",
      "type": "string",
      "doc": "Metric Name."
    }
  ]
}
A producer application downloads it using the apicurio-registry-maven-plugin and generates the Java classes with the avro-maven-plugin, configured as follows:
<plugin>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro-maven-plugin</artifactId>
  <version>${avro.version}</version>
  <executions>
    <execution>
      <phase>generate-sources</phase>
      <goals>
        <goal>schema</goal>
      </goals>
      <configuration>
        <sourceDirectory>${project.basedir}/src/main/resources/schemas</sourceDirectory>
        <includes>
          <include>**/*.avsc</include>
        </includes>
        <outputDirectory>${project.build.directory}/generated-sources/schemas</outputDirectory>
        <!-- String (instead of the default CharSequence) makes the generated
             schema carry the "avro.java.string" hint -->
        <stringType>String</stringType>
      </configuration>
    </execution>
  </executions>
</plugin>
When the producer tries to publish a new record to Kafka using the following configuration in the application.properties file:
# Aggregate metrics Generator
%dev.mp.messaging.outgoing.generated-aggregate-metrics.connector=smallrye-kafka
%dev.mp.messaging.outgoing.generated-aggregate-metrics.topic=eda.events.aggregate-metrics
%dev.mp.messaging.outgoing.generated-aggregate-metrics.acks=all
%dev.mp.messaging.outgoing.generated-aggregate-metrics.key.serializer=org.apache.kafka.common.serialization.IntegerSerializer
%dev.mp.messaging.outgoing.generated-aggregate-metrics.value.serializer=io.apicurio.registry.serde.avro.AvroKafkaSerializer
%dev.mp.messaging.outgoing.generated-aggregate-metrics.apicurio.registry.headers.enabled=true
%dev.mp.messaging.outgoing.generated-aggregate-metrics.apicurio.registry.auto-register=false
%dev.mp.messaging.outgoing.generated-aggregate-metrics.apicurio.registry.avro.encoding=JSON
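The snippet above does not show the artifact resolver strategy, but the group and artifact ID in the exception below match what RecordIdStrategy derives from the record (namespace as group, record name as artifact ID). Assuming the 2.0.x class and property names, that setting would look like:

%dev.mp.messaging.outgoing.generated-aggregate-metrics.apicurio.registry.artifact-resolver-strategy=io.apicurio.registry.serde.avro.strategy.RecordIdStrategy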
I got the following exception:
2021-06-17 15:19:47,213 ERROR [io.sma.rea.mes.kafka] (vert.x-eventloop-thread-22) SRMSG18206: Unable to write to Kafka from channel generated-aggregate-metrics (topic: eda.events.aggregate-metrics): io.apicurio.registry.rest.client.exception.ArtifactNotFoundException: No artifact with ID 'AggregateMetric' in group 'com.redhat.banking.eda.model.events' was found.
However, if the producer is configured with:
%dev.mp.messaging.outgoing.generated-aggregate-metrics.apicurio.registry.auto-register=true
Then the application works successfully, but the Service Registry ends up with two versions of the same artifact. In the second version, registered automatically by the producer, the schema has the following structure:
{
  "type": "record",
  "name": "AggregateMetric",
  "namespace": "com.redhat.banking.eda.model.events",
  "doc": "Aggregated Metric with important information.",
  "fields": [
    {
      "name": "name",
      "type": {
        "type": "string",
        "avro.java.string": "String"
      },
      "doc": "Metric Name."
    }
  ]
}
So, because the generated Java classes include the extra avro.java.string property in the schema definition, Apicurio identifies it as a different schema.
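A minimal sketch (plain Avro 1.10 API; the schema literals are trimmed to the single name field) showing that Avro itself treats the two forms as unequal, because object props take part in schema equality:

import org.apache.avro.Schema;

public class SchemaCompare {
    public static void main(String[] args) {
        // Schema as registered in Apicurio (plain "string" type).
        Schema registered = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"AggregateMetric\","
          + "\"namespace\":\"com.redhat.banking.eda.model.events\","
          + "\"fields\":[{\"name\":\"name\",\"type\":\"string\"}]}");

        // Schema embedded in the generated class (stringType=String).
        Schema generated = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"AggregateMetric\","
          + "\"namespace\":\"com.redhat.banking.eda.model.events\","
          + "\"fields\":[{\"name\":\"name\","
          + "\"type\":{\"type\":\"string\",\"avro.java.string\":\"String\"}}]}");

        // The extra property makes the schemas unequal, so any
        // content-based registry lookup misses the pre-registered version.
        System.out.println(registered.equals(generated)); // prints: false
    }
}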
It seems there are some ways to resolve this, essentially by aligning the registered Avro schema with the final schema produced by the avro-maven-plugin, but I don't know whether that is the best approach.
Should Apicurio add some extra logic to skip this kind of “extra” property during schema comparison?
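As an illustration of what skipping the property could look like on the client side, here is a hypothetical helper (not part of Apicurio; assumes Jackson on the classpath and the generated AggregateMetric class from the build above) that strips the hint so the schema matches the pre-registered form:

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;
import org.apache.avro.Schema;
import com.redhat.banking.eda.model.events.AggregateMetric;

public class StripJavaStringHint {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Recursively drop every "avro.java.string" entry from a schema JSON tree.
    static JsonNode strip(JsonNode node) {
        if (node.isObject()) {
            ((ObjectNode) node).remove("avro.java.string");
        }
        node.forEach(StripJavaStringHint::strip); // recurse into children
        return node;
    }

    public static void main(String[] args) throws Exception {
        // The schema embedded in the generated class carries the hint;
        // re-parsing the stripped JSON yields the plain, pre-registered form.
        Schema generated = AggregateMetric.getClassSchema();
        Schema plain = new Schema.Parser().parse(
                strip(MAPPER.readTree(generated.toString())).toString());
        System.out.println(plain);
    }
}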
It seems that there are some related issues as well.
Top GitHub Comments
Yes, I think disabling auto-register is definitely the way to go. And you may be right about there being a bug with EXPLICIT_ARTIFACT_VERSION. I’ve set aside some time tomorrow to look into it and get back to you.

Note that you only need to worry about the Serializer configuration. The Deserializer always uses the GlobalId of the specific schema in the registry, which is included in the Kafka message by the Serializer.
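For reference, pinning the pre-registered artifact explicitly on the serializer side would look roughly like this in application.properties (the keys correspond to SerdeConfig.EXPLICIT_ARTIFACT_VERSION and its sibling constants in 2.0.x; treat the exact names and the version number as assumptions):

# Hypothetical explicit pinning (version number illustrative)
%dev.mp.messaging.outgoing.generated-aggregate-metrics.apicurio.registry.artifact.group-id=com.redhat.banking.eda.model.events
%dev.mp.messaging.outgoing.generated-aggregate-metrics.apicurio.registry.artifact.artifact-id=AggregateMetric
%dev.mp.messaging.outgoing.generated-aggregate-metrics.apicurio.registry.artifact.version=1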
This is an excellent update with information we need to diagnose the issue, thanks @rmarting - we may need a little bit of time to dig into this. I see that you are using RecordIdStrategy, which explains part of what we’re seeing here. But I would still need to dig into why it’s not using the version you pre-register. Maybe @famartinrh knows offhand, but I would need to play with it.