A fork of redis-oplog for infinite scalability
@evolross @SimonSimCity @maxnowack (please feel free to tag more people)
- We faced major issues with redis-oplog in production on AWS Elastic Beanstalk: out-of-memory errors and disconnects from redis. After some research we found that redis-oplog duplicates data (2x for each observer) and duplicates it again for every additional observer (even if it’s the same collection and the same data)
- Also, DB hits were killing us: each update required multiple hits to update the data (to avoid race conditions). This is another major negative, as it does not scale
- Finally, we couldn’t read from MongoDB secondaries (given that we read much more often than we write, doing so would give much higher scalability)
Also, redis-oplog is slowly seeing less and less investment
We created a fork (not public yet, I am still considering my options) which does the following (more technical notes below):
- Uses a single timed cache, which is also the place that ‘findOne’ / ‘find’ read from, so full data consistency
- Uses redis to transmit changes to other instance caches – consistency again
- During updates, we mutate the cache and send the changed fields to the DB, instead of the current `find`, `update`, then `find` again, which has 2 more hits than needed
- Same for `insert`: we build the doc and send it to the other instances
- We use secondary reads in our app. There are potential race conditions in extreme cases; we are working on using redis as a temp cache of changes
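To make the update path above more concrete, here is a minimal sketch of "mutate the cache, diff, then hit the DB once". It is not code from the fork: `cache`, `changedFields`, `updateThroughCache`, and the `db` / `publish` arguments are all hypothetical names, the diff only looks at top-level fields, and only a `$set`-style change is handled.

```javascript
// Hypothetical sketch: none of these names come from the fork itself.
const cache = new Map(); // _id -> full document

// Return only the top-level fields whose values differ between two docs.
function changedFields(before, after) {
  const changed = {};
  for (const key of Object.keys(after)) {
    if (JSON.stringify(before[key]) !== JSON.stringify(after[key])) {
      changed[key] = after[key];
    }
  }
  return changed;
}

// Apply a $set-style change to the cached doc, then hit the DB once and
// publish only the changed fields to the other instances.
// `db` is assumed to expose synchronous findOne(id) and update(id, modifier).
function updateThroughCache(db, publish, id, setFields) {
  let before = cache.get(id);
  if (!before) {
    before = db.findOne(id); // cache miss: one extra read
    if (!before) return null; // nothing to update
  }
  const after = { ...before, ...setFields };
  const changed = changedFields(before, after);
  cache.set(id, after);
  if (Object.keys(changed).length === 0) return changed; // no-op update
  db.update(id, { $set: changed }); // the single DB hit
  publish({ type: 'update', id, fields: changed });
  return changed;
}
```

With a warm cache, an update in this sketch costs one `db.update` call and one small redis message, rather than a find, an update, and a second find.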
RESULTS:
- We reduced the number of meteor instances by 3x
- Faster updates, as less data is sent to redis and there are far fewer DB hits
- We substantially reduced the load on our DB instances – from 80% to 7% on primary
Here are the technical details:
- A single data cache at the collection level stores the full doc; the multiplexer sends data fields to the client based on the projector (i.e. the `fields: {}` option). `collection.findOne` fetches from that cache, which results in cache hits of 85-98% for our app
- That cache is timed; the timer resets whenever data is accessed
- Within `Mutator` we mutate what is in the cache ourselves (if it’s not there, we pull it from the DB and mutate). In other words, we don’t do an `update` followed by a `find`, so there is usually a single DB hit (`update`). We also do a diff to only dispatch fields that have changed. Same thing with insert: we build the doc and dispatch it fully to redis.
- We send to redis all the fields that have changed; the redis subscriber uses that data to extend the data that is stored within its cache (or pulls the doc from the DB, then extends it with the data from the update event). Inserts are trusted and stored in cache.
- We now use secondary DB reads, which results in much higher scalability. This is why we have #3 and #4 above: we trust redis and the cache over DB reads to avoid race conditions. We do get race conditions every once in a while (e.g. new subs and reads), but we know where they would occur and catch them there. Otherwise, we always trust the cache over data read from the DB
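The timed cache and the redis subscriber described in the list above could look roughly like the following. This is only a sketch under assumptions: `TimedDocCache`, `applyRemote`, `project`, and the event shape (`type`, `id`, `doc`, `fields`) are hypothetical and not taken from the fork, expiry is checked lazily on read instead of with real timers, and the projection handles inclusion-only `fields` on top-level keys.

```javascript
// Hypothetical sketch; class, method, and event names are illustrative only.
class TimedDocCache {
  constructor(ttlMs, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, handy for tests
    this.entries = new Map(); // _id -> { doc, expiresAt }
  }

  set(doc) {
    this.entries.set(doc._id, { doc, expiresAt: this.now() + this.ttlMs });
  }

  // Return the full doc, or undefined on a miss or after expiry.
  // Every successful read pushes the expiry forward.
  get(id) {
    const entry = this.entries.get(id);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(id);
      return undefined;
    }
    entry.expiresAt = this.now() + this.ttlMs;
    return entry.doc;
  }

  // Handle a change event received from another instance.
  // Inserts are trusted as-is; updates extend the cached doc, loading it
  // from the DB first when this instance does not have it cached.
  applyRemote(event, loadFromDb) {
    if (event.type === 'insert') {
      this.set(event.doc);
      return;
    }
    const base = this.get(event.id) || loadFromDb(event.id);
    if (base) this.set({ ...base, ...event.fields });
  }
}

// Inclusion-only projection, as a stand-in for the `fields` option.
function project(doc, fields) {
  const keys = Object.keys(fields || {});
  if (keys.length === 0) return { ...doc };
  const out = { _id: doc._id };
  for (const key of keys) {
    if (fields[key] && key in doc) out[key] = doc[key];
  }
  return out;
}
```

The cache always holds the full document, so two observers with different `fields` options can share one entry and each get their own projected view.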
QUESTION: Is this of interest? Should we have a community version of redis-oplog that we all maintain together?
Issue Analytics
- State:
- Created 3 years ago
- Reactions:10
- Comments:32 (19 by maintainers)
Top GitHub Comments
New repo is online: https://github.com/ramezrafla/redis-oplog
Please don’t use in production until you are sure it is working for you in a staging environment with multiple servers.
@edemaine
But you nailed it: to avoid race conditions, the current redis-oplog is cumbersome and heavy on DB hits
@afrokick There are some serious departures from the current approach. Some developers are happy with the way it is and I don’t want to disturb them. Also, I don’t like the Swiss Army knife approach. The code was very complex in order to please a lot of people. There were bugs, old stuff etc.
Jack of all trades, master of none 😃