A fork of redis-oplog for infinite scalability

See original GitHub issue

@evolross @SimonSimCity @maxnowack (pls feel free to tag more people)

  • We faced major issues with redis-oplog in production on AWS Elastic Beanstalk: out-of-memory errors and disconnects from redis. After some research we found that redis-oplog duplicates data (2x for each observer) and re-duplicates it for each observer (even when it’s the same collection and the same data)
  • DB hits were also killing us: each update required multiple hits to update the data (to avoid race conditions). This is another major negative – it doesn’t scale
  • Finally, we couldn’t read from MongoDB secondaries (given that we read much more often than we write, secondary reads would give us much higher scalability)

Also, redis-oplog is slowly falling into disinvestment.

We created a fork (not public yet – considering my options) which does the following (more technical notes below):

  1. Uses a single timed cache, which is also where ‘findOne’ / ‘find’ reads are served from – so full data consistency
  2. Uses redis to transmit changes to the caches of the other instances – consistency again
  3. During updates, we mutate the cache and send only the changed fields to the DB – instead of the current find, update, then find again, which costs 2 more DB hits than needed
  4. Same for inserts – we build the doc and send it to the other instances
  5. We use secondary reads in our app – there are potential race conditions in extreme cases, which we are addressing by using redis as a temporary cache of changes
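
The timed-cache idea above can be sketched in a few lines. This is a minimal illustration under stated assumptions – the class name, methods, and TTL handling are hypothetical, not the fork’s actual code: entries expire on a timer that resets on every read, and changed fields arriving from redis extend the cached doc in place instead of triggering a DB read.

```javascript
// Hypothetical sketch of a collection-level timed document cache.
class TimedDocCache {
  constructor(ttlMs = 60000) {
    this.ttlMs = ttlMs;
    this.docs = new Map(); // _id -> { doc, expiresAt }
  }
  set(id, doc) {
    this.docs.set(id, { doc, expiresAt: Date.now() + this.ttlMs });
  }
  get(id) {
    const entry = this.docs.get(id);
    if (!entry) return undefined;           // cache miss -> caller falls back to the DB
    if (Date.now() > entry.expiresAt) {     // expired -> evict, treat as a miss
      this.docs.delete(id);
      return undefined;
    }
    entry.expiresAt = Date.now() + this.ttlMs; // timer resets on every access
    return entry.doc;
  }
  // Apply changed fields coming from a local mutation or a redis update event,
  // keeping this instance's cache consistent without hitting the DB.
  extend(id, changedFields) {
    const entry = this.docs.get(id);
    if (entry) Object.assign(entry.doc, changedFields);
  }
}

const cache = new TimedDocCache(60000);
cache.set('doc1', { _id: 'doc1', name: 'a', count: 1 });
cache.extend('doc1', { count: 2 });
console.log(cache.get('doc1').count); // 2
```

In this sketch, `findOne` would consult `get()` first and only fall through to MongoDB on a miss, which is where the reported 85–98% cache hit rates would come from.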

RESULTS:

  • We reduced the number of Meteor instances by 3x
  • Updates are faster, as less data is sent to redis and far fewer DB hits are needed
  • We substantially reduced the load on our DB instances – from 80% to 7% on the primary

Here are the technical details:

  1. A single data cache at the collection level stores the full doc; the multiplexer sends each client only the data fields selected by its projection (i.e. the fields: {} option). collection.findOne fetches from that cache – this results in cache hit rates of 85-98% for our app
  2. That cache is timed; the timer resets whenever the data is accessed
  3. Within the Mutator we mutate what is in the cache ourselves (if it’s not there, we pull it from the DB and mutate it) – in other words, we don’t do an update followed by a find, so an update is usually a single DB hit. We also do a diff so that we only dispatch fields that have actually changed. Same with inserts: we build the doc and dispatch it in full to redis
  4. We send all the changed fields to redis; each redis subscriber uses that data to extend what is stored in its cache (or pulls the doc from the DB and then extends it with the data from the update event). Inserts are trusted and stored directly in the cache
  5. We now use secondary DB reads, which gives much higher scalability. This is why we have #3 and #4 above: we trust redis and the cache over DB reads to avoid race conditions. We still get race conditions every once in a while (e.g. new subs and reads), but we know where they would occur and catch them there. Otherwise, we always trust the cache over data read from the DB
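
The diff step in #3 can be sketched as follows. This is a hedged illustration: `diffFields` and the shallow JSON-based comparison are assumptions for clarity, not the fork’s real implementation.

```javascript
// Hypothetical sketch: compute only the fields that changed between the
// cached doc (before) and the mutated doc (after), so that both the DB
// update and the redis message carry the minimal payload.
function diffFields(before, after) {
  const changed = {};
  for (const key of Object.keys(after)) {
    // Shallow compare via JSON serialization; good enough for a sketch.
    if (JSON.stringify(before[key]) !== JSON.stringify(after[key])) {
      changed[key] = after[key];
    }
  }
  return changed;
}

const before = { _id: '1', title: 'old', views: 10, tags: ['a'] };
const after  = { _id: '1', title: 'old', views: 11, tags: ['a'] };
console.log(diffFields(before, after)); // { views: 11 }
```

A single DB hit then suffices – conceptually `updateOne({ _id }, { $set: changed })` – and the same `changed` payload is published on redis so the other instances extend their caches instead of re-reading from MongoDB.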

QUESTION: Is this of interest? Should we have a community version of redis-oplog that we all maintain together?

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 10
  • Comments: 32 (19 by maintainers)

Top GitHub Comments

3 reactions
ramezrafla commented, Sep 24, 2020

New repo is online: https://github.com/ramezrafla/redis-oplog

Please don’t use in production until you are sure it is working for you in a staging environment with multiple servers.

2 reactions
ramezrafla commented, Sep 23, 2020

@edemaine

  1. We can add intelligence to redis via a Lua script (or a separate process that does nothing but manage the data). redis acts as a sort of edge cache in this case and becomes the ‘golden’ copy. This will require some further work.
  2. We can also use the Go listener @SimonSimCity created and keep the DB as the golden source.

But you nailed it: to avoid race conditions, the current redis-oplog is cumbersome and heavy on DB hits.

@afrokick There are some serious departures from the current approach. Some developers are happy with the way it is and I don’t want to disturb them. Also, I don’t like the Swiss-army-knife approach: the code became very complex in order to please a lot of people, and there were bugs, old stuff, etc.

Jack of all trades master of none 😃

