Pulsar Avro Schema Enum improvement request
Is your feature request related to a problem? Please describe.
The Pulsar API is not granular enough around Avro schemas and enums. There are currently two problems with the Pulsar Avro schema resolution:
- In Python, there is no way to set required=True for an Enum field.
from enum import Enum
from pulsar.schema import Record

class Event(Enum):
    OPEN = 0
    CLOSE = 1

class Test(Record):
    event = Event  # No way to pass in 'required=True' for an enum.
Desired Schema (with required enum):
{
  "type": "record",
  "name": "Test",
  "namespace": "com.test",
  "fields": [
    {
      "name": "event",
      "type": {
        "type": "enum",
        "name": "Event",
        "symbols": ["OPEN", "CLOSE"]
      }
    }
  ]
}
Actual Schema - generated by AvroSchema(Test)
{
  "type": "record",
  "name": "Test",
  "namespace": "com.test",
  "fields": [
    {
      "name": "event",
      "type": ["null", {
        "type": "enum",
        "name": "Event",
        "symbols": ["OPEN", "CLOSE"]
      }]
    }
  ]
}
required=True was implemented for all Python types except Enum in https://github.com/apache/pulsar/pull/3526
Was the functionality missed for enums?
- In Java, building a schema with enums and withAlwaysAllowNull(false) is too restrictive. Suppose I have an AvroSchema that is defined by a Java class with two enums, one that is required and one that is not.
public class Payload {
    private Event event;
    private Direction direction;
    .....
    public enum Event { OPEN, CLOSE }
    public enum Direction { UP, DOWN }
}
Create an AvroSchema<Payload>. None of the following gives the desired schema:
AvroSchema<Payload> avroSchema = AvroSchema.of(SchemaDefinition.<Payload>builder().withPojo(Payload.class).withAlwaysAllowNull(false).build());
// OR
AvroSchema<Payload> avroSchema = AvroSchema.of(SchemaDefinition.<Payload>builder().withPojo(Payload.class).withAlwaysAllowNull(true).build());
// OR
AvroSchema<Payload> avroSchema = AvroSchema.of(SchemaDefinition.<Payload>builder().withPojo(Payload.class).build());
Desired Schema (one required enum, event, and one nullable enum, direction):
{
  "type": "record",
  "name": "Payload",
  "namespace": "com.test",
  "fields": [
    {
      "name": "event",
      "type": {
        "type": "enum",
        "name": "Event",
        "symbols": ["OPEN", "CLOSE"]
      }
    },
    {
      "name": "direction",
      "type": ["null", {
        "type": "enum",
        "name": "Direction",
        "symbols": ["UP", "DOWN"]
      }]
    }
  ]
}
Describe the solution you’d like
Allow the schema to be built from a string definition of the schema in Java and Python, or implement some way to mark each field as required or not. Python already has this, except for enums. The Java implementation is too restrictive: it requires either all of the class fields to be non-null or all of them to be nullable. There is no in-between (for example, one enum required and another nullable).
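As a rough illustration of the per-field marking described above, the sketch below builds the desired Payload schema as a plain dictionary and serializes it to a JSON string. It uses only the Python standard library; the helper names enum_type, field, and record_schema are hypothetical and are not part of the Pulsar client. How such a string would be handed to Pulsar is exactly what this request asks for, so that step is left out.

```python
import json


def enum_type(name, symbols):
    # Avro enum type definition.
    return {"type": "enum", "name": name, "symbols": list(symbols)}


def field(name, avro_type, required):
    # A required field uses the bare type; an optional field is a
    # union of "null" and the type.
    return {"name": name, "type": avro_type if required else ["null", avro_type]}


def record_schema(name, namespace, fields):
    # Avro record type wrapping the given field definitions.
    return {"type": "record", "name": name, "namespace": namespace, "fields": fields}


payload_schema = record_schema(
    "Payload",
    "com.test",
    [
        field("event", enum_type("Event", ["OPEN", "CLOSE"]), required=True),
        field("direction", enum_type("Direction", ["UP", "DOWN"]), required=False),
    ],
)

# String definition of the schema, as it would appear in a .avsc file.
schema_json = json.dumps(payload_schema, indent=2)
print(schema_json)
```

Here the required flag is decided per field, which is the granularity that withAlwaysAllowNull does not offer.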
Describe alternatives you’ve considered
Using the avro library directly instead of the wrapped Pulsar functionality, since it works for these cases.
Issue Analytics
- Created 4 years ago
- Comments:13 (6 by maintainers)
Top GitHub Comments
The problem here is that MyEnum is really of Python type Enum. That enum might be defined elsewhere in the application (or might be coming from a library), therefore we cannot count on passing required=True there. We would have to add something like:
c = pulsar.schema.Enum(MyEnum, required=True)
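To make the wrapper idea concrete, here is a minimal standalone sketch of how such a wrapper might map the required flag onto the Avro type. The class name EnumField and its avro_type method are assumptions made for illustration; this is not the Pulsar client's implementation, and it does not import pulsar at all.

```python
import enum
import json


class Event(enum.Enum):
    OPEN = 0
    CLOSE = 1


class EnumField:
    """Hypothetical wrapper around an existing Enum class.

    The Enum itself stays untouched, so it can live in another
    module or library; only the wrapper carries the flag.
    """

    def __init__(self, enum_cls, required=False):
        self.enum_cls = enum_cls
        self.required = required

    def avro_type(self):
        enum_type = {
            "type": "enum",
            "name": self.enum_cls.__name__,
            "symbols": [member.name for member in self.enum_cls],
        }
        # Required: bare enum type. Optional: union with "null".
        return enum_type if self.required else ["null", enum_type]


required_event = EnumField(Event, required=True).avro_type()
optional_event = EnumField(Event).avro_type()
print(json.dumps(required_event))
print(json.dumps(optional_event))
```

Keeping the flag on the wrapper rather than on the Enum avoids modifying enums that are defined outside the application.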
Yes, this is something we definitely need to do
What do you suggest the declaration for enums with “annotations” should look like in Python?