Same schema registered to multiple subjects - multiple global IDs - different behavior compared to Confluent Schema Registry


Consider the following example schema:

{"schema": "{\"type\": \"record\", \"name\": \"ID\", \"namespace\": \"com.example\", \"fields\": [{\"doc\": \"ID\", \"name\": \"id\", \"type\": \"int\"}]}"}
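The payload above is a JSON envelope whose `schema` field holds the Avro schema as an escaped JSON string. As a minimal sketch of how such a file could be generated (the function name `build_register_payload` is hypothetical and not part of any registry client), the schema can be serialized twice:

```python
import json

# The Avro schema from the example above, as a plain Python dict.
schema = {
    "type": "record",
    "name": "ID",
    "namespace": "com.example",
    "fields": [{"doc": "ID", "name": "id", "type": "int"}],
}


def build_register_payload(avro_schema: dict) -> str:
    """Wrap a schema in the {"schema": "<escaped JSON string>"} envelope."""
    # The inner dumps turns the schema into a string; the outer dumps
    # escapes that string's quotes when it serializes the envelope.
    return json.dumps({"schema": json.dumps(avro_schema)})


payload = build_register_payload(schema)
print(payload)
```

Redirecting the printed output into id-schema-namespaced.json would give a file suitable for the `-d@id-schema-namespaced.json` argument used in the curl commands.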

If I register this with Confluent Schema Registry, on subject “test”, I get the following result:

$ curl  -H 'Accept: application/vnd.schemaregistry.v1+json' 'http://localhost:8081/subjects/test/versions' -H 'Content-Type: application/vnd.schemaregistry.v1+json' -d@id-schema-namespaced.json ; echo
{"id":1}

Now, if I re-register it but on a different subject “test2”:

$ curl  -H 'Accept: application/vnd.schemaregistry.v1+json' 'http://localhost:8081/subjects/test2/versions' -H 'Content-Type: application/vnd.schemaregistry.v1+json' -d@id-schema-namespaced.json ; echo
{"id":1}

We can see that the schema is now registered on two different subjects:

$ curl  -H 'Accept: application/vnd.schemaregistry.v1+json' 'http://localhost:8081/subjects/test/versions' ; echo
[1]
$ curl  -H 'Accept: application/vnd.schemaregistry.v1+json' 'http://localhost:8081/subjects/test/versions/1' ; echo
{"subject":"test","version":1,"id":1,"schema":"{\"type\":\"record\",\"name\":\"ID\",\"namespace\":\"com.example\",\"fields\":[{\"name\":\"id\",\"type\":\"int\",\"doc\":\"ID\"}]}"}
$ curl  -H 'Accept: application/vnd.schemaregistry.v1+json' 'http://localhost:8081/subjects/test2/versions' ; echo
[1]
$ curl  -H 'Accept: application/vnd.schemaregistry.v1+json' 'http://localhost:8081/subjects/test2/versions/1' ; echo
{"subject":"test2","version":1,"id":1,"schema":"{\"type\":\"record\",\"name\":\"ID\",\"namespace\":\"com.example\",\"fields\":[{\"name\":\"id\",\"type\":\"int\",\"doc\":\"ID\"}]}"}

As we can see, even though the schema is registered on two different subjects, it has the same global ID: 1.

However, Apicurio Registry 1.3.0.Final behaves differently (I’ve tested this with both the asyncmem and the streams backends):

$ curl  -H 'Accept: application/vnd.schemaregistry.v1+json' 'http://localhost:8080/api/ccompat/subjects/test/versions' -H 'Content-Type: application/vnd.schemaregistry.v1+json' -d@id-schema-namespaced.json ; echo
{"id":1}
$ curl  -H 'Accept: application/vnd.schemaregistry.v1+json' 'http://localhost:8080/api/ccompat/subjects/test2/versions' -H 'Content-Type: application/vnd.schemaregistry.v1+json' -d@id-schema-namespaced.json ; echo
{"id":2}

We can already see that the global ID is not the same, and this can be verified using the versions endpoint:

$ curl  -H 'Accept: application/vnd.schemaregistry.v1+json' 'http://localhost:8080/api/ccompat/subjects/test/versions/1' ; echo
{"id":1,"subject":"test","version":1,"schema":"{\"type\": \"record\", \"name\": \"ID\", \"namespace\": \"com.example\", \"fields\": [{\"doc\": \"ID\", \"name\": \"id\", \"type\": \"int\"}]}"}
$ curl  -H 'Accept: application/vnd.schemaregistry.v1+json' 'http://localhost:8080/api/ccompat/subjects/test2/versions/1' ; echo
{"id":2,"subject":"test2","version":1,"schema":"{\"type\": \"record\", \"name\": \"ID\", \"namespace\": \"com.example\", \"fields\": [{\"doc\": \"ID\", \"name\": \"id\", \"type\": \"int\"}]}"}

Is this a bug, or by design? If it’s by design, why?

For me, having an ID that is the same for a given schema, regardless of subject, is part of the beauty of using a Schema Registry, as it gives me flexibility in how I use schemas and route data.
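To illustrate why a subject-independent ID matters for routing, here is a rough sketch of the Confluent wire format, in which each message starts with a zero magic byte followed by the schema ID as a 4-byte big-endian integer. The helper names `frame` and `unframe` are hypothetical, and the message body is an opaque placeholder rather than real Avro-encoded data:

```python
import struct
from typing import Tuple

MAGIC_BYTE = 0  # first byte of a Confluent-framed message
HEADER = struct.Struct(">bI")  # magic byte + 4-byte big-endian schema ID


def frame(schema_id: int, body: bytes) -> bytes:
    """Prefix an already-serialized body with the magic byte and schema ID."""
    return HEADER.pack(MAGIC_BYTE, schema_id) + body


def unframe(message: bytes) -> Tuple[int, bytes]:
    """Split a framed message into (schema_id, body)."""
    magic, schema_id = HEADER.unpack(message[:HEADER.size])
    if magic != MAGIC_BYTE:
        raise ValueError("unexpected magic byte: %d" % magic)
    return schema_id, message[HEADER.size:]


body = b"\x2a"  # placeholder bytes standing in for an Avro-encoded record
message = frame(1, body)
schema_id, decoded_body = unframe(message)
```

The framed message carries only the ID and no subject name, so if the same schema has ID 1 under both “test” and “test2”, a consumer can decode the record no matter which topic it was routed through.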

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 12 (9 by maintainers)

Top GitHub Comments

EricWittmann commented, Mar 1, 2021 (1 reaction)

Update: this will be addressed in version 2.0.

@carlesarnal and @famartinrh - can you guys collaborate on making sure this is fixed? I think there are two things to do here:

  1. The ccompat API implementation needs to be updated to use contentId instead of globalId (I believe in all cases) (@carlesarnal)
  2. The Apicurio Registry Serdes classes need a configuration option to read/write the contentId instead of the globalId (@famartinrh)
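The distinction between the two identifiers can be sketched with a toy in-memory model. This is an illustration only, not Apicurio's actual storage code: the class `ToyRegistry` and its method are hypothetical, schemas are compared as raw strings with no canonicalization, and re-registering the same schema under the same subject is not deduplicated.

```python
import itertools
from typing import Dict, List, Tuple


class ToyRegistry:
    """Toy model: globalId per registered version, contentId per distinct schema."""

    def __init__(self) -> None:
        self._next_global_id = itertools.count(1)
        self._content_ids: Dict[str, int] = {}
        self.versions: Dict[str, List[Tuple[int, int]]] = {}

    def register(self, subject: str, schema: str) -> Tuple[int, int]:
        # contentId: reused whenever identical schema text has been seen before.
        content_id = self._content_ids.setdefault(
            schema, len(self._content_ids) + 1
        )
        # globalId: a fresh value for every registered version.
        global_id = next(self._next_global_id)
        self.versions.setdefault(subject, []).append((global_id, content_id))
        return global_id, content_id


SCHEMA = '{"type": "record", "name": "ID", "fields": [{"name": "id", "type": "int"}]}'

registry = ToyRegistry()
first = registry.register("test", SCHEMA)
second = registry.register("test2", SCHEMA)
print(first, second)
```

In this model the two registrations get different globalId values but share one contentId, so a compatibility layer that reports the contentId as its "id" would give the same answer for both subjects.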
EricWittmann commented, Sep 9, 2020 (1 reaction)

We had a quick internal discussion about this today and realized that we don’t necessarily have to implement this functionality for the Apicurio Registry API; we just need to implement it for the Confluent Compatibility API. Thanks for that suggestion, @jsenko.

I need to think through how we might be able to provide a separate “globalId” for the Confluent compatibility layer that behaves the way you are expecting and what implications that might have for other parts of the application and other use-cases. For example, does this approach break the ability to mix and match Apicurio serdes classes with Confluent serdes classes?

So this is just an idea at this point, but if we can make it work then we’ll have an option that will allow us to support the feature without doing a 2.0 release.
