Protobuf backwards compatibility check issue?
Hello, thank you for sharing the Apicurio Registry; it’s a great service. I have a question about whether the protobuf backwards compatibility logic conforms to what protobuf itself considers backwards compatible, particularly with regard to the checkNoChangingFieldNames method.
If I create a v1 message like this:
syntax = "proto3";

message FooMessage {
  string foo = 1;
  int32 page_number = 2;
}
And then attempt to create a v2 message in which I remove a field:
syntax = "proto3";

message FooMessage {
  reserved "foo";
  reserved 1;
  int32 page_number = 2;
}
Or, rename the field:
syntax = "proto3";

message FooMessage {
  string foobar = 1;
  reserved "foo";
  int32 page_number = 2;
}
… checkNoChangingFieldNames registers both of these as issues, because it compares the nameAfter (value equals null) to the beforeKV (value equals foo) in the former case, and foobar to foo in the latter case.
I think this could be resolved by adding a check that, although the name is null or changed, the old name is reserved (similar to what is done in checkNoRemovingFieldsWithoutReserve).
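To make the suggestion concrete, here is a minimal sketch of such a relaxed check. It is written in Python purely for illustration and is not the registry's actual code: the `MessageSchema` type and the `check_field_names_relaxed` function are hypothetical, and the schema model (a tag-to-name map plus a set of reserved names) is a simplifying assumption. Reserving the tag itself is left to a separate check.

```python
from dataclasses import dataclass, field


@dataclass
class MessageSchema:
    """Hypothetical, simplified view of one protobuf message."""
    fields: dict                      # tag number -> field name
    reserved_names: set = field(default_factory=set)


def check_field_names_relaxed(before: MessageSchema, after: MessageSchema) -> list:
    """Report a removed or renamed field only when its old name is not reserved."""
    issues = []
    for tag, name_before in before.fields.items():
        name_after = after.fields.get(tag)  # None when the field was removed
        if name_after == name_before:
            continue  # name unchanged, nothing to report
        if name_before in after.reserved_names:
            continue  # old name is reserved, so the change is tolerated
        issues.append(
            f"tag {tag}: name changed from {name_before!r} to {name_after!r} "
            "without reserving the old name"
        )
    return issues


# The three schemas from this issue, plus a rename that forgets the reservation.
v1 = MessageSchema(fields={1: "foo", 2: "page_number"})
removed = MessageSchema(fields={2: "page_number"}, reserved_names={"foo"})
renamed = MessageSchema(fields={1: "foobar", 2: "page_number"}, reserved_names={"foo"})
unreserved = MessageSchema(fields={1: "foobar", 2: "page_number"})

print(check_field_names_relaxed(v1, removed))
print(check_field_names_relaxed(v1, renamed))
print(check_field_names_relaxed(v1, unreserved))
```

Under these assumptions, the removal and the rename with `reserved "foo"` both pass, and only the rename without a reservation is reported.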
However, I wonder whether checkNoChangingFieldNames is an appropriate check for protobuf at all, given that field names are changeable by design:
You can change the name of a field in the schema, since the encoded data never refers to field names, but you cannot change a field’s tag, since that would make all existing encoded data invalid.
(cite: https://www.oreilly.com/library/view/designing-data-intensive-applications/9781491903063/ch04.html)
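The point in the quote can be illustrated with a small hand-rolled encoder. This is only a sketch of the protobuf wire format for the two field kinds used above (length-delimited strings and non-negative varint integers), not a replacement for a real protobuf library; the helper names and the name-to-tag schema tables are hypothetical.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_field(tag: int, kind: str, value) -> bytes:
    """Encode one field; only the tag and wire type reach the output."""
    if kind == "string":
        payload = value.encode("utf-8")
        key = (tag << 3) | 2  # wire type 2: length-delimited
        return encode_varint(key) + encode_varint(len(payload)) + payload
    if kind == "int32":
        key = (tag << 3) | 0  # wire type 0: varint (non-negative values only here)
        return encode_varint(key) + encode_varint(value)
    raise ValueError(f"unsupported kind: {kind}")


def encode_message(schema: dict, values: dict) -> bytes:
    """schema maps field name -> (tag, kind); names are used only for lookup."""
    return b"".join(
        encode_field(tag, kind, values[name]) for name, (tag, kind) in schema.items()
    )


schema_v1 = {"foo": (1, "string"), "page_number": (2, "int32")}
schema_v2 = {"foobar": (1, "string"), "page_number": (2, "int32")}  # field 1 renamed

bytes_v1 = encode_message(schema_v1, {"foo": "hello", "page_number": 3})
bytes_v2 = encode_message(schema_v2, {"foobar": "hello", "page_number": 3})

print(bytes_v1.hex())        # 0a0568656c6c6f1003
print(bytes_v1 == bytes_v2)  # True
```

Both schemas produce the same bytes because the key of each field is built from the tag and wire type alone, so renaming foo to foobar leaves existing encoded data readable.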
Thanks for any feedback on this.
Issue Analytics
- Created 3 years ago
- Comments:5 (4 by maintainers)
Top GitHub Comments
@zenon-was-here I checked with the author of the protobuf compatibility code, and got some additional context:
So it seems we have gone the strict route, which in the case of a schema registry perhaps we don’t want.
I’m wondering if the right answer here is to add another option when configuring compatibility rules for protocol buffer schemas: strict vs. relaxed.
Thoughts @jsenko ?
I’m not against adding strict/relaxed options, but I worry about the complexity that could introduce, since we would have to define those categories clearly. A better approach might be to let users ignore specific compatibility (sub)rules, which can be defined more cleanly, and otherwise keep the check strict. We need to verify that the current implementations can support this, but for JSON Schema, for example, I think it can be done without much difficulty.
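As a rough sketch of what ignorable sub-rules could look like, the snippet below keeps every rule enabled by default and lets the caller skip rules by ID. It is written in Python for brevity and makes several assumptions: the rule IDs, the `RULES` table, and the `check_compatibility` function are all hypothetical and do not reflect the registry's configuration or API, and a schema is reduced to a tag-to-name map.

```python
def no_changing_field_names(before: dict, after: dict) -> list:
    """Flag tags present in both versions whose names differ."""
    return [
        f"tag {tag}: renamed {before[tag]!r} -> {after[tag]!r}"
        for tag in before
        if tag in after and before[tag] != after[tag]
    ]


def no_removing_fields(before: dict, after: dict) -> list:
    """Flag tags that exist in the old version but not in the new one."""
    return [f"tag {tag}: field {before[tag]!r} removed" for tag in before if tag not in after]


# Hypothetical rule IDs; strict mode is simply "nothing ignored".
RULES = {
    "NO_CHANGING_FIELD_NAMES": no_changing_field_names,
    "NO_REMOVING_FIELDS": no_removing_fields,
}


def check_compatibility(before: dict, after: dict, ignored=frozenset()) -> list:
    """Run every rule except those whose ID appears in `ignored`."""
    issues = []
    for rule_id, rule in RULES.items():
        if rule_id in ignored:
            continue
        issues.extend(rule(before, after))
    return issues


v1 = {1: "foo", 2: "page_number"}
v2 = {1: "foobar", 2: "page_number"}

strict = check_compatibility(v1, v2)
relaxed = check_compatibility(v1, v2, ignored={"NO_CHANGING_FIELD_NAMES"})
print(strict)
print(relaxed)
```

With this shape, each rule stays individually well defined, and a "relaxed" mode is just a particular set of ignored rule IDs rather than a separate category that needs its own definition.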