Protobuf backwards compatibility check issue?

Hello, thank you for sharing the Apicurio Registry, it’s a great service. I have a question about whether the logic for protobuf backwards compatibility conforms to what protobuf itself considers backwards compatible, particularly with regard to the checkNoChangingFieldNames method.

If I create a v1 message like this:

syntax = "proto3";

message FooMessage {
  string foo = 1;
  int32 page_number = 2;
}

And then attempt to create a v2 message in which I remove a field:

syntax = "proto3";

message FooMessage {
  reserved "foo";
  reserved 1;
  int32 page_number = 2;
}

Or, rename the field:

syntax = "proto3";

message FooMessage {
  string foobar = 1;
  reserved "foo";
  int32 page_number = 2;
}

… checkNoChangingFieldNames registers these as an issue, because it compares the nameAfter (value equals null) to the beforeKV (value equals foo) in the former case, and foobar to foo in the latter case.

I think this could be resolved by adding a check that, even though the name is null or changed, the old name has been reserved (similar to what checkNoRemovingFieldsWithoutReserve does).
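A minimal sketch of that suggestion, written in Python rather than the registry's Java. The `Message` model and the function name are hypothetical and do not come from Apicurio's code; the sketch only assumes that a message can be viewed as a map from tag number to field name plus a set of reserved names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Message:
    """Hypothetical, simplified view of one protobuf message definition."""
    fields: Dict[int, str]                          # tag number -> field name
    reserved_names: Set[str] = field(default_factory=set)


def changed_names_without_reserve(before: Message, after: Message) -> List[str]:
    """Return old field names that were removed or renamed without being reserved."""
    issues = []
    for tag, old_name in before.fields.items():
        new_name = after.fields.get(tag)            # None when the field was removed
        if new_name != old_name and old_name not in after.reserved_names:
            issues.append(old_name)
    return issues


v1 = Message({1: "foo", 2: "page_number"})
removed = Message({2: "page_number"}, reserved_names={"foo"})
renamed = Message({1: "foobar", 2: "page_number"}, reserved_names={"foo"})
renamed_unreserved = Message({1: "foobar", 2: "page_number"})

print(changed_names_without_reserve(v1, removed))             # []
print(changed_names_without_reserve(v1, renamed))             # []
print(changed_names_without_reserve(v1, renamed_unreserved))  # ['foo']
```

Under this rule, both v2 schemas from the question pass, and only a rename that leaves the old name unreserved is reported.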

However, I wonder whether checkNoChangingFieldNames is an appropriate check for protobufs at all, given that field names are changeable by design:

You can change the name of a field in the schema, since the encoded data never refers to field names, but you cannot change a field’s tag, since that would make all existing encoded data invalid.

(cite: https://www.oreilly.com/library/view/designing-data-intensive-applications/9781491903063/ch04.html)
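To make the quoted point concrete, here is a small hand-rolled encoder for FooMessage following the protobuf wire format (each field is written as a key of `(field_number << 3) | wire_type` followed by its value). The helper names are made up for this sketch, and it handles only a string in field 1 and a non-negative integer in field 2.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)   # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_key(field_number: int, wire_type: int) -> bytes:
    """Encode the field key: tag number and wire type packed into one varint."""
    return encode_varint((field_number << 3) | wire_type)


def encode_foo_message(text: str, page_number: int) -> bytes:
    """Encode field 1 (string) and field 2 (int32, non-negative only)."""
    payload = text.encode("utf-8")
    return (
        encode_key(1, 2) + encode_varint(len(payload)) + payload  # wire type 2: length-delimited
        + encode_key(2, 0) + encode_varint(page_number)           # wire type 0: varint
    )


encoded = encode_foo_message("hi", 150)
print(encoded.hex())  # 0a026869109601
```

Nothing in the encoder takes a field name as input, so renaming `foo` to `foobar` produces exactly the same bytes; changing the tag number from 1 to something else would change the first byte.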

Thanks for any feedback on this.

Top GitHub Comments

EricWittmann commented, Mar 16, 2021

@zenon-was-here I checked with the author of the protobuf compatibility code, and got some additional context:

The protocol buffers compatibility rules originate from https://github.com/nilslice/protolock which unlike my code has the notion of strict vs. relaxed compatibility modes.

Strict has rules that are in some cases really perhaps better thought of as paranoid. There could be cases where e.g. code transforms protobuf data to another format and then operates on it, at which point the field names matter. However, the more important motivation at the time was the requirement for compatibility of the registry with the way the Confluent one behaved.
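A hedged illustration of the kind of case described here, where names start to matter once data leaves the wire format. The tag-to-name dictionaries and the `to_json` helper are invented for this sketch; they stand in for any code that re-keys decoded values by schema field name.

```python
import json


def to_json(values_by_tag, names_by_tag):
    """Render decoded values (keyed by tag) as JSON keyed by schema field names."""
    return json.dumps({names_by_tag[tag]: value for tag, value in values_by_tag.items()})


decoded = {1: "hi", 2: 150}                    # what a decoder gets from the wire: tags, not names
v1_names = {1: "foo", 2: "page_number"}
v2_names = {1: "foobar", 2: "page_number"}     # field 1 renamed

doc_v1 = json.loads(to_json(decoded, v1_names))
doc_v2 = json.loads(to_json(decoded, v2_names))

print("foo" in doc_v1)  # True
print("foo" in doc_v2)  # False
```

A downstream consumer that reads `doc["foo"]` keeps working against the v1 schema and fails against v2, even though the encoded bytes are identical.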

So it seems we have gone the strict route, which in the case of a schema registry perhaps we don’t want.

I’m wondering if the right answer here is to add another option when configuring compatibility rules for protocol buffer schemas: strict vs. relaxed.

Thoughts @jsenko ?

jsenko commented, Mar 17, 2021

I’m not against adding strict/relaxed options, but I worry about the complexity that could introduce. We would have to clearly define those categories. A better approach might be to let users ignore certain compatibility (sub)rules, which could be more cleanly defined, and otherwise keep the check strict. We need to check whether the current implementations can support this, but in the case of e.g. JSON Schema, I think it can be done without much difficulty.
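One way the "ignore certain sub-rules" idea could look, as a rough Python sketch. The rule identifiers, the tag-to-name schema model, and `check_compatibility` are all hypothetical and are not the registry's configuration or API; the two sub-rules are deliberately simplified (for example, the removal rule ignores `reserved`).

```python
from typing import Callable, Dict, FrozenSet, List

Schema = Dict[int, str]                          # hypothetical model: tag number -> field name
SubRule = Callable[[Schema, Schema], List[str]]


def no_changing_field_names(before: Schema, after: Schema) -> List[str]:
    """Report fields whose tag is kept but whose name differs."""
    return [
        f"field {tag} renamed: {before[tag]} -> {after[tag]}"
        for tag in before
        if tag in after and after[tag] != before[tag]
    ]


def no_removing_fields(before: Schema, after: Schema) -> List[str]:
    """Report fields whose tag no longer exists (simplified: ignores 'reserved')."""
    return [f"field {tag} removed: {before[tag]}" for tag in before if tag not in after]


SUB_RULES: Dict[str, SubRule] = {
    "NO_CHANGING_FIELD_NAMES": no_changing_field_names,
    "NO_REMOVING_FIELDS": no_removing_fields,
}


def check_compatibility(
    before: Schema, after: Schema, ignored: FrozenSet[str] = frozenset()
) -> List[str]:
    """Run every sub-rule except the ignored ones; strict when nothing is ignored."""
    issues: List[str] = []
    for rule_id, rule in SUB_RULES.items():
        if rule_id not in ignored:
            issues.extend(rule(before, after))
    return issues


v1 = {1: "foo", 2: "page_number"}
v2 = {1: "foobar", 2: "page_number"}

print(check_compatibility(v1, v2))
# ['field 1 renamed: foo -> foobar']
print(check_compatibility(v1, v2, ignored=frozenset({"NO_CHANGING_FIELD_NAMES"})))
# []
```

With this shape, the default stays strict, and a "relaxed" mode could be expressed as nothing more than a predefined set of ignored sub-rule identifiers.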