AMQP Body types design discussion

AMQP data

There are three classes of AMQP data types: Primitive, Composite, and Restricted. There is also the concept of described types, which means adding a descriptor to a type. Composite and Restricted types can also be Described, while Primitive types cannot.

Primitive - this is any of the common primitive types such as string, boolean, byte, int, double, array, list, map etc.
Composite - these are types containing multiple fields. It is actually represented as a described list where each element of the list can be a different type.
Restricted - this applies constraints to another type. e.g. an int that is restricted to a few values can be thought of as an enum. It is also used for reserved described types, such as all of the different body types.

AMQP Message Body

From http://docs.oasis-open.org/amqp/core/v1.0/os/amqp-core-messaging-v1.0-os.html#type-data:

The body consists of one of the following three choices: one or more data sections, one or more amqp-sequence sections, or a single amqp-value section.

Since Service Bus relies on the AMQP protocol, there has been a push from customers to expose all message body types that are included in the spec. Users may have messages coming from other sources that they want to then push into Service Bus. By supporting the spec, we will allow the SDK to be interoperable with other messaging systems. On the other hand, there is a danger in tying our API to a specific underlying protocol, as the API is an abstraction that should be able to evolve separately from the protocol.

Data is simply a described array of bytes:

<type name="data" class="restricted" source="binary" provides="section">     
<descriptor name="amqp:data:binary" code="0x00000000:0x00000075"/>
</type>

AmqpSequence is a described list of any AMQP type. Each element in the sequence can be of a different type.

<type name="amqp-sequence" class="restricted" source="list" provides="section">     
<descriptor name="amqp:amqp-sequence:list" code="0x00000000:0x00000076"/> 
</type>

AmqpValue is a described AMQP type.

<type name="amqp-value" class="restricted" source="*" provides="section">     
<descriptor name="amqp:amqp-value:*" code="0x00000000:0x00000077"/> 
</type>

Note that each of the body types actually falls in the restricted class. AmqpValue is the most general type, as you can replace data (or a list of data sections) or a sequence (or a list of sequence sections) with a single AmqpValue. An AmqpValue can be any primitive or composite (note the source attribute). The next most general is AmqpSequence: all data sections can be represented as sequences, but sequences can also hold non-binary data. The most specific is Data. Data sections can only hold binary data, and a body may have a list of these.
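To make the "one of three choices" rule and the generality ordering concrete, here is a minimal sketch. Every type and member name in it (AmqpBodySketch, BodyKind, DataAsSequence) is hypothetical and belongs to neither the AMQP library nor the Service Bus SDK. The mapping of one data section to one single-element sequence is just one possible way to widen Data into AmqpSequence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model for illustration only; not an existing SDK type.
public enum BodyKind { Data, Sequence, Value }

public sealed class AmqpBodySketch
{
    public BodyKind Kind { get; }
    public IReadOnlyList<byte[]> DataSections { get; }
    public IReadOnlyList<IList<object>> SequenceSections { get; }
    public object Value { get; }

    private AmqpBodySketch(BodyKind kind, IReadOnlyList<byte[]> data,
        IReadOnlyList<IList<object>> sequences, object value)
    {
        Kind = kind;
        DataSections = data;
        SequenceSections = sequences;
        Value = value;
    }

    // One or more data sections.
    public static AmqpBodySketch FromData(params byte[][] sections)
    {
        if (sections.Length == 0)
            throw new ArgumentException("At least one data section is required.");
        return new AmqpBodySketch(BodyKind.Data, sections, null, null);
    }

    // One or more amqp-sequence sections.
    public static AmqpBodySketch FromSequences(params IList<object>[] sections)
    {
        if (sections.Length == 0)
            throw new ArgumentException("At least one sequence section is required.");
        return new AmqpBodySketch(BodyKind.Sequence, null, sections, null);
    }

    // Exactly one amqp-value section.
    public static AmqpBodySketch FromValue(object value) =>
        new AmqpBodySketch(BodyKind.Value, null, null, value);

    // Assumed mapping: each data section becomes a sequence holding one binary element.
    public AmqpBodySketch DataAsSequence()
    {
        if (Kind != BodyKind.Data)
            throw new InvalidOperationException("Body is not made of data sections.");
        IList<object>[] sections = DataSections
            .Select(bytes => (IList<object>)new List<object> { bytes })
            .ToArray();
        return FromSequences(sections);
    }
}
```

Because the constructor is private, a body can only be built through one of the three factories, so it always holds exactly one kind of section.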

In Java Track 1, there is support for the different body types, but for binary and sequence it is limited to a single section of each. This differs from the AMQP spec, which allows a list of each. The reason for the discrepancy is that the underlying AMQP library used by the Java SDK does not support including a list of AmqpSequence or Data sections. If we can make the same simplification and do away with lists in our API for sequence and data, the API would be a lot cleaner. Interestingly, in Java this is a runtime constraint, so users still need to construct a list of lists even though it can only have one element.

The .NET AMQP library already supports all body types:

        public static AmqpMessage Create(IEnumerable<AmqpSequence> amqpSequence);
        public static AmqpMessage Create(AmqpValue value);
        public static AmqpMessage Create(IEnumerable<Data> dataList);
        public static AmqpMessage Create(Data data);

Currently, the .NET Track 2 library only allows setting a single binary Data body, i.e. the last overload in the list above.

Option 1a (support getting data out but not in; offer an object property)

At a minimum, being able to translate an AmqpMessage that populates either Sequence or Value would be a nice addition. In Track 1, this is done through the generic GetBody extension methods (at least for Value). The method accepts a type parameter naming the type you expect, and it then attempts to cast the underlying object property to that type. We could do something similar by offering an object BodyObject property. It would be populated with whatever is in the Value, Sequence, or Data, in whatever deserialized form the AMQP library hands back. If it is Data, it would be a byte array (or a list of byte arrays if we fully support the AMQP spec). If it is a Value, it could be any object. If it is a Sequence, it would be a list of any object (or a list of lists of any object if we fully support the AMQP spec).

The downside is that we would essentially drop what we know about the shape of the data, e.g. that a Sequence should be a list.
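A rough sketch of the object-property approach follows. ReceivedMessageSketch and its GetBody<T> extension are hypothetical stand-ins, not the Track 1 or Track 2 types, and the cast-or-throw behavior is an assumption about how such a method might work.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a received message; not the real ServiceBusReceivedMessage.
public sealed class ReceivedMessageSketch
{
    public ReceivedMessageSketch(object bodyObject) => BodyObject = bodyObject;

    // Holds the Data, Sequence, or Value payload in whatever form it was deserialized.
    public object BodyObject { get; }
}

public static class ReceivedMessageSketchExtensions
{
    // Returns the body as T when the runtime type matches; otherwise throws.
    public static T GetBody<T>(this ReceivedMessageSketch message)
    {
        if (message.BodyObject is T typed)
        {
            return typed;
        }

        string actual = message.BodyObject == null ? "null" : message.BodyObject.GetType().Name;
        throw new InvalidCastException("Body is " + actual + ", not " + typeof(T).Name + ".");
    }
}
```

With this shape, a caller expecting a sequence would write msg.GetBody<IList<object>>() and only find out at runtime whether the body was a sequence at all, which is the loss of shape information described above.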

Option 1b (support getting data out but not in with properties for each body type)

Here we would still support only getting sequence and value bodies, but we would use separate properties and not return the data on the Binary property.

It is worth noting that I’m not sure it is feasible to address the customer issue the way they ask for it, where the byte[] Body property is populated with bytes regardless of the source. If we are dealing with an arbitrary type or an arbitrary list of types, it isn’t clear how we would encode it into bytes in a way that the user would then know what to do with. For instance, if we are dealing with a sequence, how would we represent each individual element with just an array of bytes? Would we just smash everything together and ignore the partitions? In AMQP, this is handled by having separate fields for each element of the list. But I don’t think we would want to expose the raw AMQP bytes, because then our SDK would take a public dependency on AMQP. From the Service Bus docs:

It should also be noted that while AMQP has a powerful binary encoding model, it is tied to the AMQP messaging ecosystem and HTTP clients will have trouble decoding such payloads.
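The concern about smashing sequence elements together can be shown with a small sketch. FlattenSketch is a hypothetical helper, and encoding each element as UTF-8 text is an assumption made only to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper for illustration; not part of any SDK.
public static class FlattenSketch
{
    // Naively concatenates the UTF-8 bytes of each element, discarding element boundaries.
    public static byte[] Flatten(IEnumerable<string> sequence) =>
        sequence.SelectMany(s => Encoding.UTF8.GetBytes(s)).ToArray();
}
```

Flattening the sequences ["ab", "c"] and ["a", "bc"] yields the same three bytes, so a receiver handed only the byte array cannot recover the original partitioning without some extra framing.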

Option 2a (set and get all different body types using new composite type)

If we want to go all in on the different body types, there are two broad approaches we could take:

Create a composite type that holds the different body type data.
- This is what Java did in Track 1.
- The downside is that for creating and retrieving a message body, users would now potentially have one extra step.

    public class MessageBody
    {
        public ReadOnlyMemory<byte> BinaryData { get; set; }
        public IEnumerable<IList<object>> SequenceData { get; set; }
        public object Value { get; set; }
        public static implicit operator ReadOnlyMemory<byte>(MessageBody body) => body.BinaryData;
    }

Note that we can use an implicit conversion to let users still write something like the following, which alleviates some of the concern about the extra step needed to get at the data:

    ReadOnlyMemory<byte> body = msg.MessageBody;

As declared above, the operator targets ReadOnlyMemory<byte>, so the assignment target has to be that type (or the operator would need to return byte[] instead). We wouldn’t be able to offer this conversion for Value or SequenceData, because C# does not allow user-defined conversions to object or to interface types. We could also offer constructor overloads for creating a ServiceBusMessage so that users don’t need to create a MessageBody instance themselves. Thus the experience for sending and receiving messages using binary data would be essentially unchanged.

Using the byte body:

var msg = new ServiceBusMessage(Encoding.UTF8.GetBytes("my cool message"));
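// Read the body through the implicit MessageBody -> ReadOnlyMemory<byte> conversion;
// call .ToArray() on the result if a byte[] copy is needed.
ReadOnlyMemory<byte> body = msg.Body;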
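For reference, here is a self-contained sketch of how the implicit conversions could fit together. MessageBodySketch is a hypothetical, trimmed stand-in for the proposed MessageBody, and the byte[]-to-body operator is an assumed addition that is not part of the proposal.

```csharp
using System;
using System.Text;

// Hypothetical, trimmed stand-in for the proposed MessageBody type.
public class MessageBodySketch
{
    public ReadOnlyMemory<byte> BinaryData { get; set; }

    // Same direction as the operator in the proposal: body -> ReadOnlyMemory<byte>.
    public static implicit operator ReadOnlyMemory<byte>(MessageBodySketch body) => body.BinaryData;

    // Assumed addition: lets a byte[] be supplied wherever a body is expected.
    public static implicit operator MessageBodySketch(byte[] bytes) =>
        new MessageBodySketch { BinaryData = bytes };
}
```

With these operators, MessageBodySketch body = Encoding.UTF8.GetBytes("my cool message"); compiles, as does ReadOnlyMemory<byte> memory = body;. Getting a byte[] back takes an explicit memory.ToArray(), which copies the data.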

Option 2b (set and get all different body types using properties directly on ServiceBusReceivedMessage)

The other approach would be to flatten the properties out: instead of introducing a new type, we would put the properties directly on the message. If we use a common prefix for the properties, they will be grouped together in IntelliSense, e.g. BodyBinary, BodySequence, BodyValue. The downside is that getting the data out of a received message is a bit less intuitive, as it wouldn’t be obvious which property to use. With option 2a, it isn’t necessarily obvious that there is an implicit conversion for binary data, but we can document this on the MessageBody property, whereas with the properties flattened out, users may be more overwhelmed by the different options.

Using the byte body:

var msg = new ServiceBusMessage(Encoding.UTF8.GetBytes("my cool message"));
byte[] body = msg.BodyBinary;
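To illustrate why the flattened shape can be less obvious on the receive side, here is a sketch in which the receiver has to probe each property in turn. FlatMessageSketch is hypothetical, and making BodyBinary nullable to signal "not a binary body" is an assumption rather than part of the proposal.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical flattened message shape; not the real ServiceBusReceivedMessage.
public sealed class FlatMessageSketch
{
    // Assumption: null means the body was not sent as Data sections.
    public ReadOnlyMemory<byte>? BodyBinary { get; set; }
    public IEnumerable<IList<object>> BodySequence { get; set; }
    public object BodyValue { get; set; }

    // The receiver must check each property to learn which body type arrived.
    public string DescribeBody()
    {
        if (BodyBinary.HasValue) return "binary, " + BodyBinary.Value.Length + " bytes";
        if (BodySequence != null) return "sequence";
        if (BodyValue != null) return "value";
        return "empty";
    }
}
```

A single discriminator property (for example, an enum naming the populated body type) would be one way to avoid this probing, at the cost of one more member on the message.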

Top GitHub Comments

jsquire commented, Apr 28, 2020

I think this discussion is premature, given that there are efforts underway to include a pluggable and customizable serializer as part of the pipeline of services. Given that this is a core effort driven by the Messaging team in collaboration with the ADP team, I would expect that Service Bus would be one of the services to adopt the serialization as part of the messaging pipeline.

The goal would be to allow an arbitrary type to be specified as the body of a message which would pass through that serializer in both directions. This will enable developers with specialized needs to control the object => byte serialization (and vice-versa) using an approach designed to be cross-language compatible.

JoshLove-msft commented, Apr 28, 2020

The main scenario I can think of involves compatibility with other SDKs. If a message was sent using an AmqpSequence, would the custom serializer allow me to get back the body as a list of objects?

Sure. I can take a list of objects, serialize it to a number of formats (JSON, XML, Avro, BOND, ProtoBuf, etc) and represent that as bytes. This also ensures that the same payload can be sent via other protocols.

Yes, but we can’t interoperate with messages that were sent not using the custom serializer by relying on the AMQP spec. This may not be something that we want to support, but just calling it out.