Dealing with Message and Request Size Limits

Many existing consumers, message queues, and middleware routers limit the maximum size of the messages they accept.

Some examples I’ve encountered:

AWS Lambda: 256 KB (Event/asynchronous invocation), 6 MB (RequestResponse/synchronous invocation)
AWS Kinesis: 1 MB
AWS SNS & SQS: 256 KB
Azure EventGrid: 64 KB
Azure Functions: 15 MB
Azure ServiceBus: 256 KB
Google Cloud Functions: 10 MB
Google Cloud PubSub: 10 MB
Kafka: 1 MB (default, configurable)
MQTT: 256 MB
IronMQ: 64 KB or 256 KB

There are often further limitations, e.g. how many attributes can be set per message, the size of a single attribute or of all attributes together.

In order to ensure interoperability, I believe the spec should address this issue.

I am currently implementing a CloudEvents producer, and I’m struggling with how to approach it. When I integrated with a specific message queue in the past, I knew the limit up front (mostly… edge case: a message of 200 KB can be put into SNS, but a Lambda listener cannot consume it). If a message exceeded that limit, I did not deliver the data and instead instructed the consumer to follow the “claim check” pattern (basically, retrieve the message data via another request).
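A minimal sketch of that producer-side decision, assuming the event is a plain dict serialized as JSON and that the limit is known up front. The names `MAX_MESSAGE_BYTES`, `encoded_size`, and `prepare_for_delivery` are hypothetical and only serve the illustration; the 256 KB value is the SNS/SQS figure from the list above.

```python
import json

# Assumed limit for illustration: the 256 KB SNS/SQS figure listed above.
MAX_MESSAGE_BYTES = 256 * 1024


def encoded_size(event: dict) -> int:
    """Size in bytes of the event when serialized as UTF-8 JSON."""
    return len(json.dumps(event).encode("utf-8"))


def prepare_for_delivery(event: dict, limit: int = MAX_MESSAGE_BYTES) -> dict:
    """Return the event unchanged if it fits, else a copy without its data."""
    if encoded_size(event) <= limit:
        return event
    # Claim check: ship only the context; the consumer fetches data separately.
    return {key: value for key, value in event.items() if key != "data"}
```

The original event is left untouched, so the producer can still serve the full data when the consumer asks for it.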

This problem is also relevant for a middleware/event router. It may not want to accept messages that it cannot deliver, or it may want to implement the claim check pattern to mediate between producers and consumers.

I have two proposals (that are not exclusive):

Introduce a static size limit

With interoperability of existing systems in mind, a limit is picked (e.g. 128 KB for data, no context attribute can be larger than 1 KB, a maximum of 50 extension-entries allowed).

Consumers (including middlewares) MUST support messages within these limits, and MAY accept larger messages. Producers MUST support creating only messages within these limits, but MAY be configurable to create larger messages.

Alternatively, several sizes could be chosen (e.g. 128 KB, 256 KB, 1 MB, 10 MB), and everyone MUST support at least the smallest size. Producers MAY be configurable to create larger messages.
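As an illustration only, the example limits from this proposal could be checked as follows. The sketch assumes the event is a dict whose context attributes sit at the top level next to `data` and `extensions`, and that sizes are measured on the JSON encoding; both are assumptions, and `limit_violations` is a hypothetical helper.

```python
import json

MAX_DATA_BYTES = 128 * 1024    # example value from the proposal
MAX_ATTRIBUTE_BYTES = 1024     # example value from the proposal
MAX_EXTENSION_ENTRIES = 50     # example value from the proposal


def size_of(value) -> int:
    """Size in bytes of a value when serialized as UTF-8 JSON."""
    return len(json.dumps(value).encode("utf-8"))


def limit_violations(event: dict) -> list:
    """Return a description of every limit the event exceeds."""
    problems = []
    if size_of(event.get("data")) > MAX_DATA_BYTES:
        problems.append("data exceeds 128 KB")
    for name, value in event.items():
        if name in ("data", "extensions"):
            continue  # checked separately
        if size_of(value) > MAX_ATTRIBUTE_BYTES:
            problems.append(f"context attribute {name!r} exceeds 1 KB")
    if len(event.get("extensions", {})) > MAX_EXTENSION_ENTRIES:
        problems.append("more than 50 extension entries")
    return problems
```

A producer could refuse to emit an event with a non-empty result, and a consumer would only be obliged to accept events for which the list is empty.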

Remark: I’d prefer a dynamic size limit (the consumer tells the producer the message is too large, e.g. via a 413 Payload Too Large response), but I don’t think it is feasible for a middleware that buffers messages unless the middleware can “shrink” a message (see next proposal). It would also mean more effort for all implementations, unless it is combined with a guaranteed minimum limit.

First-class support for claim check pattern

In a scenario with just a producer and a consumer, the claim check pattern is easy to implement: The consumer knows that for an eventType, data exists. If in the received message the data attribute is empty, it knows from the context how to request the data elsewhere (e.g. using the eventId and/or source field: http://example.org/{eventId}.json). Downsides: The consumer is quite strongly coupled, the retrieval has to be re-programmed for each producer, and for a newly introduced eventType, it may not know if data exists or not.

However, in a scenario with a middleware, it gets harder to implement. If the middleware must remove the data, it cannot generically tell the consumer where to retrieve it (e.g. http://middleware.com/{eventId}.json).

The proposal is to add the URI where the data can be retrieved to the context attributes. A producer could set both the URI and the data (in which case a middleware can just drop the data), or a middleware can insert the URI later on, if needed. The URI MAY be non-public; in that case the consumer has to know how to authorize.
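A rough sketch of how a middleware might “shrink” a message under this proposal. The attribute name `dataUri` is hypothetical (the issue does not name the attribute), `store` stands in for whatever storage the middleware uses, and the URL layout follows the http://middleware.com/{eventId}.json example above.

```python
import json

DATA_URI_ATTRIBUTE = "dataUri"  # hypothetical name; the issue does not define one


def shrink(event: dict, limit: int, store: dict, base_url: str) -> dict:
    """Drop data from an oversized event, leaving a URI that points to it."""
    if len(json.dumps(event).encode("utf-8")) <= limit:
        return event
    slim = {key: value for key, value in event.items() if key != "data"}
    if DATA_URI_ATTRIBUTE not in slim:
        # The producer gave no URI, so the middleware keeps the data itself.
        store[event["eventId"]] = event.get("data")
        slim[DATA_URI_ATTRIBUTE] = f"{base_url}/{event['eventId']}.json"
    return slim
```

When the producer has already set the URI, the middleware only drops the data and stores nothing.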

That would also allow an SDK to simplify implementing a consumer by retrieving the data in the background.
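What such an SDK helper might look like, as a sketch under assumptions: the URI lives in a hypothetical `dataUri` attribute, the data is served as JSON, and no authorization is needed. A consumer with other requirements would pass its own `fetch` function.

```python
import json
import urllib.request

DATA_URI_ATTRIBUTE = "dataUri"  # hypothetical attribute name


def http_fetch(uri: str):
    """Default fetcher: plain GET, JSON body, no authorization."""
    with urllib.request.urlopen(uri) as response:
        return json.load(response)


def resolve_data(event: dict, fetch=http_fetch):
    """Return the event's data, fetching it via the data URI if it was dropped."""
    if event.get("data") is not None:
        return event["data"]
    uri = event.get(DATA_URI_ATTRIBUTE)
    return fetch(uri) if uri else None
```

Consumer code then calls `resolve_data(event)` and does not need to know whether the data arrived inline or was retrieved separately.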

I am happy to receive feedback on how to approach this in v0.1. If there is some consensus that either proposal should be explored more deeply, I’m willing to work on (probably separate?) PRs for them.

Top GitHub Comments

rockymadden commented, Jun 28, 2018

As another data point, I’ll mention that NATS/NATS Streaming has a 1MB default much like Kafka. Like Kafka, it can also be configured. Both our internal FaaS, which will be supporting CloudEvents, and Fission use NATS Streaming as a messaging/queue backbone.

cneijenhuis commented, Jun 7, 2019

Addressed with #405
