Dealing with Message and Request Size Limits
A lot of existing consumers, message queues, and middleware routers limit the maximum size of the messages they accept.
Some examples I’ve encountered:
- AWS Lambda: 256 KB (Event/asynchronous invocation), 6 MB (RequestResponse/synchronous invocation)
- AWS Kinesis: 1 MB
- AWS SNS & SQS: 256 KB
- Azure EventGrid: 64 KB
- Azure Functions: 15 MB
- Azure ServiceBus: 256 KB
- Google Cloud Functions: 10 MB
- Google Cloud PubSub: 10 MB
- Kafka: 1 MB (default, configurable)
- MQTT: 256 MB
- IronMQ: 64 KB or 256 KB
There are often further limitations, e.g. how many attributes can be set per message, the size of a single attribute or of all attributes together.
In order to ensure interoperability, I believe the spec should address this issue.
I am currently implementing a CloudEvents producer, and I’m struggling with how to approach this. When I integrated with a specific message queue in the past, I knew the limit up front (mostly; one edge case: a message of 200 KB can be put into SNS, but a Lambda listener cannot consume it). If a message exceeds that limit, I don’t deliver the data and instead instruct the consumer to follow the “claim check” pattern (basically, retrieve the message data via another request).
This problem is also relevant for a middleware/event router. It may not want to accept messages that it cannot deliver, or it may want to implement the claim check pattern to mediate between producers and consumers.
I have two proposals (which are not mutually exclusive):
Introduce a static size limit
With interoperability of existing systems in mind, a limit is picked (e.g. 128 KB for data, no context attribute can be larger than 1 KB, a maximum of 50 extension-entries allowed).
Consumers (including middlewares) MUST support messages within these limits, and MAY accept larger messages. Producers MUST be able to create messages that stay within these limits, but MAY be configurable to create larger messages.
Alternatively, several sizes could be chosen (e.g. 128 KB, 256 KB, 1 MB, 10 MB), and everyone MUST support at least the smallest size. Producers MAY be configurable to create larger messages.
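To make the static limit more concrete, here is a minimal sketch of how a producer or middleware might check an event against it. The numbers are just the example values from above, not agreed limits. It also assumes the event is a plain dict using v0.1-style attribute names with an `extensions` map, and that sizes are measured on the JSON serialization; `size_violations` is a hypothetical helper, not part of any SDK.

```python
import json

# Example values from the proposal above; these are NOT agreed limits.
MAX_DATA_BYTES = 128 * 1024
MAX_ATTRIBUTE_BYTES = 1024
MAX_EXTENSIONS = 50


def json_size(value) -> int:
    """Size in bytes of the value when serialized as UTF-8 JSON (an assumption)."""
    return len(json.dumps(value).encode("utf-8"))


def size_violations(event: dict) -> list:
    """Return human-readable limit violations; an empty list means the event fits."""
    violations = []

    data = event.get("data")
    if data is not None and json_size(data) > MAX_DATA_BYTES:
        violations.append(f"data is {json_size(data)} bytes (limit {MAX_DATA_BYTES})")

    extensions = event.get("extensions") or {}
    if len(extensions) > MAX_EXTENSIONS:
        violations.append(f"{len(extensions)} extension entries (limit {MAX_EXTENSIONS})")

    # Every other top-level key is treated as a context attribute.
    for name, value in event.items():
        if name in ("data", "extensions"):
            continue
        if json_size(value) > MAX_ATTRIBUTE_BYTES:
            violations.append(f"attribute {name} exceeds {MAX_ATTRIBUTE_BYTES} bytes")

    return violations
```

A consumer that must support the guaranteed limit could run the same check to decide whether it is obliged to accept a message or merely allowed to.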
Remark: I’d prefer a dynamic size limit (the consumer tells the producer the message is too large, e.g. via a 413 Payload Too Large response), but I don’t think it is feasible for a middleware that buffers messages unless the middleware can “shrink” a message (see next proposal). It would also mean more effort for all implementations, if not combined with a lowest, guaranteed limit.
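For a direct producer-to-consumer connection, the dynamic variant could look roughly like the sketch below. It assumes a hypothetical `post` callable that sends the event and returns the HTTP status code; on a 413 the producer retries once without the data, leaving the consumer to fetch it via the claim check pattern.

```python
PAYLOAD_TOO_LARGE = 413


def deliver(event: dict, post) -> int:
    """Send the event; if the receiver answers 413, retry once without data.

    `post` is a hypothetical callable taking the event dict and returning the
    HTTP status code of the response.
    """
    status = post(event)
    if status != PAYLOAD_TOO_LARGE or "data" not in event:
        return status

    # Drop only the data; the context attributes are kept unchanged.
    without_data = {key: value for key, value in event.items() if key != "data"}
    return post(without_data)
```

This only works when the sender sees the response immediately, which is exactly what a buffering middleware cannot rely on.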
First-class support for claim check pattern
In a scenario with just a producer and a consumer, the claim check pattern is easy to implement: The consumer knows that for an eventType, data exists. If in the received message the data attribute is empty, it knows from the context how to request the data elsewhere (e.g. using the eventId and/or source field: http://example.org/{eventId}.json). Downsides: The consumer is quite strongly coupled, the retrieval has to be re-programmed for each producer, and for a newly introduced eventType, it may not know if data exists or not.
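A rough sketch of this simple two-party case, which also shows the coupling: the consumer carries a hard-coded table of URL templates per eventType. The event type name and the table are made up for illustration, the URL template is the example from above, and `fetch` is injectable so the lookup can be exercised without a network.

```python
import json
import urllib.request

# Per-producer knowledge the consumer has to hard-code (hypothetical entry).
DATA_URL_TEMPLATES = {
    "com.example.order.created": "http://example.org/{eventId}.json",
}


def fetch_json(url: str):
    """Default fetcher: plain HTTP GET, body parsed as JSON."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def resolve_data(event: dict, fetch=fetch_json):
    """Return the event data, retrieving it separately if the message has none."""
    data = event.get("data")
    if data not in (None, ""):
        return data  # data was delivered inline

    template = DATA_URL_TEMPLATES.get(event["eventType"])
    if template is None:
        # Unknown eventType: the consumer cannot tell whether data exists at all.
        return None
    return fetch(template.format(eventId=event["eventId"]))
```

Every new producer or eventType means another entry in that table, which is the re-programming effort mentioned above.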
However, in a scenario with a middleware, it gets harder to implement. If the middleware must remove the data, it cannot generically tell the consumer where to retrieve it (e.g. http://middleware.com/{eventId}.json).
The proposal is to add the URI where the data can be retrieved to the context attributes. A producer could set both the URI and the data (in which case a middleware can just drop the data), or a middleware can insert it later on, if needed. The URI might not be publicly accessible; in that case the consumer has to know how to authorize.
That would also allow an SDK to simplify implementing a consumer, and retrieve the data in the background.
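Here is a minimal sketch of both sides of this proposal. The attribute name `dataUri` is purely a placeholder (no name has been proposed), and `store` and `fetch` are hypothetical callables standing in for whatever storage and HTTP client a middleware or SDK would actually use. Authorization is left out entirely.

```python
import json

DATA_URI_ATTRIBUTE = "dataUri"  # placeholder name, not part of any spec


def shrink(event: dict, max_data_bytes: int, store) -> dict:
    """Middleware side: drop oversized data and leave a URI pointing to it.

    `store(event_id, data)` is a hypothetical callable that persists the data
    and returns the URI under which it can be retrieved.
    """
    data = event.get("data")
    if data is None or len(json.dumps(data).encode("utf-8")) <= max_data_bytes:
        return event  # nothing to do, the event already fits

    slim = dict(event)  # shallow copy, the original event is left untouched
    if DATA_URI_ATTRIBUTE not in slim:
        # The producer did not provide a URI, so the middleware hosts the data.
        slim[DATA_URI_ATTRIBUTE] = store(event["eventId"], data)
    del slim["data"]
    return slim


def load_data(event: dict, fetch):
    """Consumer/SDK side: return inline data, or fetch it from the URI."""
    if event.get("data") is not None:
        return event["data"]
    uri = event.get(DATA_URI_ATTRIBUTE)
    return fetch(uri) if uri else None
```

If the producer already set the URI, the middleware only has to delete the data, which keeps it free of any storage responsibility.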
I am happy to receive feedback on how to approach this in v0.1. If there is some consensus that either proposal should be explored more deeply, I’m willing to work on (probably separate?) PRs for them.
As another data point, I’ll mention that NATS/NATS Streaming has a 1 MB default, much like Kafka. Like Kafka, it can also be configured. Both our internal FaaS, which will be supporting CloudEvents, and Fission use NATS Streaming as a messaging/queue backbone.
Addressed with #405