PubSub Architecture

Continuing from django/channels_redis#251 - I have a few concerns around the architecture for pubsub. While the reconnect logic is admirable, I’m not sure that it’s appropriate in pubsub because it allows messages to be missed silently.

Let’s say the connection is lost between our _do_keepalive() loops, and a publisher elsewhere sends a message. RedisSingleShardConnection will silently reconnect, having missed the message. In the current blocking list architecture, this would be fine because we could reconnect to the same key and continue popping items; they would simply queue up like a log (Redis streams would work well). But in pubsub, those messages will be lost.
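To make the difference concrete, here is a minimal in-memory sketch. It does not use Redis or channels_redis at all: `ToyBroker` and all of its methods are hypothetical stand-ins that only imitate the two delivery models described above (pub/sub delivers to whoever is subscribed at publish time; a list keeps items until they are popped).

```python
from collections import defaultdict, deque


class ToyBroker:
    """Hypothetical in-memory stand-in for a broker; this is NOT the Redis API."""

    def __init__(self):
        self.subscribers = defaultdict(set)  # channel -> currently connected subscriber ids
        self.inboxes = defaultdict(list)     # subscriber id -> messages actually delivered
        self.lists = defaultdict(deque)      # key -> messages waiting to be popped

    # Pub/sub semantics: a message reaches only subscribers connected right now.
    def subscribe(self, channel, sub_id):
        self.subscribers[channel].add(sub_id)

    def unsubscribe(self, channel, sub_id):
        self.subscribers[channel].discard(sub_id)

    def publish(self, channel, message):
        for sub_id in self.subscribers[channel]:
            self.inboxes[sub_id].append(message)

    # List semantics: messages stay queued under the key until someone pops them.
    def push(self, key, message):
        self.lists[key].append(message)

    def pop_all(self, key):
        items = list(self.lists[key])
        self.lists[key].clear()
        return items


def simulate():
    broker = ToyBroker()
    broker.subscribe("chat", "consumer-1")

    broker.publish("chat", "m1")
    broker.push("chat-list", "m1")

    # The connection drops; the consumer is no longer subscribed.
    broker.unsubscribe("chat", "consumer-1")
    broker.publish("chat", "m2")
    broker.push("chat-list", "m2")

    # Silent reconnect: the consumer resubscribes without knowing it missed anything.
    broker.subscribe("chat", "consumer-1")
    broker.publish("chat", "m3")
    broker.push("chat-list", "m3")

    return broker.inboxes["consumer-1"], broker.pop_all("chat-list")


if __name__ == "__main__":
    pubsub_seen, list_seen = simulate()
    print("pub/sub consumer saw:", pubsub_seen)
    print("list consumer saw:   ", list_seen)
```

In this toy model the pub/sub consumer never sees the message published while it was disconnected, and nothing tells it so, whereas the list consumer picks up everything after reconnecting.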

This is very relevant to our usage. In the event of a network blip or sentinel failover, the websocket consumers disconnect and the frontend attempts to recover gracefully by reconnecting websockets and performing a full state refresh to ensure no data has been missed.
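A rough sketch of that recovery pattern, under stated assumptions: `RealtimeView`, `fetch_snapshot`, and the plain dict standing in for the database are all hypothetical names for illustration, not part of channels or channels_redis. The point is only that an explicit disconnect gives the client a chance to refetch from the source of truth.

```python
class RealtimeView:
    """Hypothetical client-side cache: pushed updates are an optimisation,
    the snapshot source (an API backed by the database) is the source of truth."""

    def __init__(self, fetch_snapshot):
        self.fetch_snapshot = fetch_snapshot  # assumed callable returning the full state
        self.state = {}
        self.connected = False

    def on_connect(self):
        # Full state refresh on every (re)connect, so gaps cannot persist.
        self.state = dict(self.fetch_snapshot())
        self.connected = True

    def on_disconnect(self):
        self.connected = False

    def on_message(self, key, value):
        # Pushed updates are only applied while connected.
        if self.connected:
            self.state[key] = value


def demo():
    database = {"a": 1}  # stand-in for the real source of truth
    view = RealtimeView(lambda: database)
    view.on_connect()

    # Normal operation: the write is stored and also pushed to the client.
    database["b"] = 2
    view.on_message("b", 2)

    # Network blip: the write is stored, but the push never arrives.
    view.on_disconnect()
    database["c"] = 3
    stale = dict(view.state)

    # Because the disconnect was visible, the client reconnects and refreshes.
    view.on_connect()
    return stale, view.state
```

If the transport reconnects silently instead, `on_disconnect` and `on_connect` never fire in this sketch, and the client keeps the stale state without knowing it.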

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 16 (3 by maintainers)

Top GitHub Comments

1 reaction
qeternity commented, Jun 28, 2021

@LiteWait @acu192 Absolutely you should not be using Daphne in prod (we also use Uvicorn). In terms of dropping messages between webserver and client, websockets run over tcp so you should have the same guarantees there as you do with tcp. That said, you should expect network issues everywhere. If you are using websockets as a source of truth, I think that’s a mistake as you’d need to implement some sort of 2PC on top. Distributed systems are difficult, which is why we treat redis/channels as a nice-to-have real-time sync, which we expect to break and we fall back to sync’ing via api which is backed by our postgres cluster and postgres’ decades of battle testing to overcome these exact issues.

1 reaction
qeternity commented, Jun 26, 2021

@acu192 Btw - you may have something similar in-house but to stress our infra for consistency at scale, we built this little tool - https://github.com/zumalabs/sockbasher

In dire need of some external docs but might be useful in its current form for you.
