PubSub Architecture
Continuing from django/channels_redis#251 - I have a few concerns around the architecture for pubsub. While the reconnect logic is admirable, I'm not sure it's appropriate for pub/sub, because it allows messages to be missed silently.
Let's say the connection is lost between `_do_keepalive()` loops and a publisher elsewhere sends a message. `RedisSingleShardConnection` will silently reconnect, having missed the message. In the current blocking-list architecture this would be fine, because we could reconnect to the same key and continue popping items; they would simply queue up like a log (Redis Streams would work well here). But in pub/sub, those messages are lost.
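To make the failure mode concrete, here is a minimal, hypothetical sketch (not channels_redis code) using redis-py's asyncio client against a local Redis: a pub/sub subscriber that loses its subscription cannot recover anything published in the gap, whereas a stream consumer can resume from the last ID it processed. The channel name "events" and stream key "events-stream" are made up for illustration.

```python
import asyncio

import redis.asyncio as redis


async def main():
    r = redis.Redis()
    await r.delete("events-stream")  # start the demo from a clean stream

    # --- pub/sub: anything published while we are not subscribed is gone ---
    pubsub = r.pubsub()
    await pubsub.subscribe("events")
    await pubsub.unsubscribe("events")              # stand-in for a dropped connection
    await r.publish("events", "sent during the outage")
    await pubsub.subscribe("events")                # "reconnect"
    msg = None
    for _ in range(3):                              # a few polls; only subscribe confirmations arrive
        msg = await pubsub.get_message(ignore_subscribe_messages=True, timeout=1.0)
        if msg is not None:
            break
    print("pub/sub after reconnect:", msg)          # None: the missed message never shows up

    # --- stream: the consumer simply resumes from the last ID it handled ---
    last_seen_id = "0"                              # last ID processed before the outage
    await r.xadd("events-stream", {"body": "sent during the outage"})
    entries = await r.xread({"events-stream": last_seen_id}, block=1000)
    print("stream after reconnect:", entries)       # the entry is still there

    await pubsub.close()
    await r.close()


asyncio.run(main())
```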
This is very relevant to our usage. In the event of a network blip or Sentinel failover, the websocket consumers disconnect and the frontend attempts to gracefully recover by reconnecting its websockets and performing a full state refresh to ensure no data has been missed.
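For illustration only, the recovery loop described above could look like the following asyncio sketch (our real frontend does the equivalent in the browser; the `websockets` and `aiohttp` packages, the URLs, and the `apply_state`/`apply_update` hooks are placeholders, not actual project code):

```python
import asyncio

import aiohttp
import websockets


async def run(ws_url, state_url, apply_state, apply_update):
    while True:
        try:
            async with websockets.connect(ws_url) as ws:
                # On every (re)connect, pull a full snapshot first so that
                # anything missed while disconnected is covered.
                async with aiohttp.ClientSession() as http:
                    async with http.get(state_url) as resp:
                        apply_state(await resp.json())
                # Then consume real-time updates until the socket drops.
                async for message in ws:
                    apply_update(message)
        except (OSError, websockets.ConnectionClosed):
            await asyncio.sleep(1)  # brief back-off, then reconnect
```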
@LiteWait @acu192 Absolutely you should not be using Daphne in prod (we also use Uvicorn). In terms of dropping messages between the webserver and the client, WebSockets run over TCP, so you get the same delivery guarantees there that TCP gives you. That said, you should expect network issues everywhere. If you are using WebSockets as a source of truth, I think that's a mistake, as you'd need to implement some sort of 2PC on top. Distributed systems are difficult, which is why we treat Redis/Channels as a nice-to-have real-time sync that we expect to break; we fall back to syncing via an API backed by our Postgres cluster and Postgres' decades of battle testing to overcome these exact issues.
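A rough sketch of that pattern with Django Channels might look like the consumer below. The "updates" group name and the load_snapshot() helper are hypothetical; the point is that the channel layer only carries best-effort update hints, while the authoritative state always comes from the database and is re-sent on every (re)connect.

```python
from channels.generic.websocket import AsyncJsonWebsocketConsumer


class StateSyncConsumer(AsyncJsonWebsocketConsumer):
    async def connect(self):
        await self.channel_layer.group_add("updates", self.channel_name)
        await self.accept()
        # Send an authoritative snapshot (from Postgres) on every (re)connect,
        # so nothing depends on the channel layer having delivered every event.
        await self.send_json({"type": "full_state", "data": await self.load_snapshot()})

    async def disconnect(self, code):
        await self.channel_layer.group_discard("updates", self.channel_name)

    async def broadcast_update(self, event):
        # Invoked via channel_layer.group_send({"type": "broadcast.update", ...});
        # these messages are treated as best-effort hints, not a source of truth.
        await self.send_json({"type": "update", "data": event["data"]})

    async def load_snapshot(self):
        # Placeholder: query the database (e.g. via database_sync_to_async)
        # and return a JSON-serialisable snapshot of the current state.
        return {}
```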
@acu192 Btw - you may have something similar in-house, but to stress our infra for consistency at scale we built this little tool: https://github.com/zumalabs/sockbasher. It's in dire need of some external docs, but it might be useful for you in its current form.