Pulsar Avro Schema Enum improvement request

Is your feature request related to a problem? Please describe.

The Pulsar API is not granular enough around Avro schemas and enums. There are currently two problems with Pulsar's Avro schema resolution:

  1. In Python, there is no way to set required=True on an Enum field.

from enum import Enum
from pulsar.schema import Record

class Event(Enum):
    OPEN = 0
    CLOSE = 1

class Test(Record):
    event = Event  # no way to pass required=True for an enum field

Desired Schema (with required enum):

{
    "type": "record",
    "name": "Test",
    "namespace": "com.test",
    "fields": [
        {
            "name": "event",
            "type": {
                "type": "enum",
                "name": "Event",
                "symbols": ["OPEN", "CLOSE"]
            }
        }
    ]
}

Actual Schema - generated by AvroSchema(Test)

{
    "type": "record",
    "name": "Test",
    "namespace": "com.test",
    "fields": [ {
            "name": "event",
            "type": ["null", {
                    "type": "enum",
                    "name": "Event",
                    "symbols": ["OPEN", "CLOSE"]
                }
            ]
        }
    ]
}

required=True was implemented for all Python types except Enum in https://github.com/apache/pulsar/pull/3526

Was the functionality missed for Enums?
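
To make the difference between the two schemas concrete, here is a minimal sketch in plain Python (standard library only, not the Pulsar client). The helper enum_field_schema and its required parameter are hypothetical names used only for illustration. It shows the field shape a required enum would be expected to produce, next to the nullable union that is generated today.

```python
import json
from enum import Enum


class Event(Enum):
    OPEN = 0
    CLOSE = 1


def enum_field_schema(name, enum_cls, required=False):
    """Return an Avro field definition for an enum-typed record field.

    Hypothetical helper, not part of pulsar.schema.
    """
    enum_type = {
        "type": "enum",
        "name": enum_cls.__name__,
        "symbols": [member.name for member in enum_cls],
    }
    # A required field uses the enum type directly; an optional field is a
    # union with "null", which is the shape reported as the actual schema.
    return {"name": name, "type": enum_type if required else ["null", enum_type]}


required_field = enum_field_schema("event", Event, required=True)
optional_field = enum_field_schema("event", Event)
print(json.dumps(required_field, indent=4))
print(json.dumps(optional_field, indent=4))
```

The only difference between the two results is whether the enum definition is wrapped in a ["null", ...] union, which is exactly the difference between the desired and actual schemas above.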

  2. In Java, building a schema with enums and withAlwaysAllowNull(false) is too restrictive. Suppose I have an AvroSchema defined by a Java class with two enums, one required and one not required.
public class Payload {
    private Event event;
    private Direction direction;
    // .....
    public enum Event { OPEN, CLOSE }
    public enum Direction { UP, DOWN }
}

Create an AvroSchema<Payload>. None of the following gives the desired schema:

AvroSchema<Payload> avroSchema = AvroSchema.of(
        SchemaDefinition.<Payload>builder().withPojo(Payload.class).withAlwaysAllowNull(false).build());
// OR
AvroSchema<Payload> avroSchema = AvroSchema.of(
        SchemaDefinition.<Payload>builder().withPojo(Payload.class).withAlwaysAllowNull(true).build());
// OR
AvroSchema<Payload> avroSchema = AvroSchema.of(
        SchemaDefinition.<Payload>builder().withPojo(Payload.class).build());

Desired Schema (one required enum, event, and one nullable enum, direction):

{
    "type": "record",
    "name": "Payload",
    "namespace": "com.test",
    "fields": [
        {
            "name": "event",
            "type": {
                "type": "enum",
                "name": "Event",
                "symbols": ["OPEN", "CLOSE"]
            }
        },
        {
            "name": "direction",
            "type": ["null", {
                "type": "enum",
                "name": "Direction",
                "symbols": ["UP", "DOWN"]
            }]
        }
    ]
}

Describe the solution you’d like

Allow the schema to be built from a String definition of the schema in Java and Python, OR implement some way to mark each field as required or not. Python already has this for every type except enums. The Java implementation is too restrictive: either all of the class fields must be non-null or all of them may be null. There is no in-between, such as one enum being required and another being nullable.
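
As a rough illustration of the "String definition" option, the sketch below uses only the Python standard library (no Pulsar or Avro client). It assumes the schema is written by hand as a JSON string matching the desired Payload schema shown earlier. The helper field_nullability is a hypothetical name; it simply reports which fields are unions containing "null", which is the per-field control being asked for.

```python
import json

# Hand-written schema definition: "event" is a plain enum (required),
# "direction" is a union with "null" (nullable).
PAYLOAD_SCHEMA = """
{
    "type": "record",
    "name": "Payload",
    "namespace": "com.test",
    "fields": [
        {"name": "event",
         "type": {"type": "enum", "name": "Event", "symbols": ["OPEN", "CLOSE"]}},
        {"name": "direction",
         "type": ["null", {"type": "enum", "name": "Direction", "symbols": ["UP", "DOWN"]}]}
    ]
}
"""


def field_nullability(schema_json):
    """Map each field name to True if its type is a union containing "null"."""
    schema = json.loads(schema_json)
    return {
        field["name"]: isinstance(field["type"], list) and "null" in field["type"]
        for field in schema["fields"]
    }


nullability = field_nullability(PAYLOAD_SCHEMA)
print(nullability)
```

With a string definition, nullability is decided field by field in the schema text itself, so it does not depend on a single all-or-nothing switch.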

Describe alternatives you’ve considered

Using the avro library directly instead of the wrapped Pulsar functionality, since it works for these cases.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 13 (6 by maintainers)

Top GitHub Comments

2 reactions
merlimat commented, May 13, 2019

The problem here is that MyEnum is really of Python type Enum. That enum might be defined elsewhere in the application (or might come from a library), so we cannot count on passing required=True there.

We would have to add something like c = pulsar.schema.Enum(MyEnum, required=True)
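
One way to picture this wrapper idea is the standard-library sketch below. It does not use the Pulsar client: EnumField, Example, and enum_fields are hypothetical names chosen for illustration, and the wrapper in the comment above was only a proposal at the time. The point is that the enum itself stays untouched, while the required flag lives on the wrapper that the record declares.

```python
from enum import Enum


class MyEnum(Enum):  # stands in for an enum defined in another module or library
    A = 1
    B = 2


class EnumField:
    """Hypothetical wrapper: pairs an existing enum with a required flag."""

    def __init__(self, enum_cls, required=False):
        self.enum_cls = enum_cls
        self.required = required


class Example:
    legacy = MyEnum                       # bare enum: stays optional
    c = EnumField(MyEnum, required=True)  # wrapped enum: marked required


def enum_fields(record_cls):
    """Collect enum attributes, treating bare enums as optional fields."""
    fields = {}
    for name, value in vars(record_cls).items():
        if isinstance(value, type) and issubclass(value, Enum):
            value = EnumField(value)  # keep the existing declaration style working
        if isinstance(value, EnumField):
            fields[name] = value
    return fields


fields = enum_fields(Example)
print({name: field.required for name, field in fields.items()})
```

Under these assumptions, existing declarations that assign the enum class directly keep their current nullable behavior, and only fields that opt in through the wrapper become required.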

1 reaction
merlimat commented, May 13, 2019

> Allow the schema to be built from a String definition of the Schema in Java and Python

Yes, this is something we definitely need to do

> OR implement some method to identify each field as required or not. Python already has this, except it does not exist for enums.

How do you suggest the declaration for enums with “annotations” should look in Python?
