possible simpler design?
Hello @donbeave @kikov79 @harai @cvallance @ericln,
Awesome project… I’ve been looking at setting up MongoDB with replica sets on Kubernetes, and I found this project, so I’m pretty stoked.
Yet, I’m a little wary of deploying (yet another) NodeJS application in my cluster… It seems a little overkill to run this as a sidecar permanently, so I wanted to pick your brain(s) about another approach: you’ve obviously spent a bunch of time on this, so you might be able to point out the flaws in the concept right away, before I spend too much time working on it.
Here are the assumptions:
- A replica set needs to be initialized on the master node, and the master node only.
- MongoDB pods are deployed with a ReplicationController, so if a pod dies it gets replaced; a script run on postStart can then check for pods that are gone and pods that came in, and reconfigure the master accordingly.
- A headless service provides a list of IPs for the pods referenced by the mongoDB label (e.g. via a plain DNS lookup, as sketched below).
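To illustrate that last assumption (the service name `mongo` in the `default` namespace is hypothetical; use whatever the headless service is actually called):

```sh
# A headless service has no cluster IP, so its DNS name resolves straight to
# the backing pod IPs: one A record per pod matching the selector.
nslookup mongo.default.svc.cluster.local
```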
I’m thinking all this can easily be done with a shell script:
- Looking up pods with the apiserver is just a `curl` command. I use this all the time and use `jq` to parse the JSON output.
- Finding the master (or lack thereof on bootstrap) can be done with the `mongo` shell from the command line.
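For the first point, something along these lines should do — a minimal sketch assuming a `role=mongo` label (the label selector is made up) and the standard in-cluster service-account mounts:

```sh
# List the IPs of all pods labelled role=mongo via the apiserver.
SA=/var/run/secrets/kubernetes.io/serviceaccount
TOKEN="$(cat $SA/token)"
NS="$(cat $SA/namespace)"

curl -s --cacert "$SA/ca.crt" \
  -H "Authorization: Bearer ${TOKEN}" \
  "https://kubernetes.default.svc/api/v1/namespaces/${NS}/pods?labelSelector=role%3Dmongo" \
  | jq -r '.items[].status.podIP'
```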
So the idea is the following: on the container lifecycle `postStart` hook, run a shell script in the pod that will:
- Look up the IPs of all MongoDB pods.
- Run the mongo shell `rs.status()` command against each mongo pod to find whether there is a master, or an already-initialized replica set (a sketch of this logic follows the list).
  -> If there is a master, list all pods in the replica set with the mongo shell `rs.conf()` command on the master, and add/remove pods according to what the Kubernetes apiserver pod list provides.
  -> If there is no master, `rs.initiate()` the current pod as master, and add the other pods as replicas.
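Roughly, that decision tree could look like this — purely a sketch, where `pod_ips` stands for the curl/jq lookup above and `$POD_IP` is assumed to be injected via the downward API:

```sh
# Find the current master, if any, by asking each pod.
MASTER=""
for ip in $(pod_ips); do
  # rs.isMaster().ismaster is true only on the current primary.
  if mongo --host "$ip" --quiet --eval 'rs.isMaster().ismaster' | grep -q true; then
    MASTER="$ip"
    break
  fi
done

if [ -n "$MASTER" ]; then
  # A master exists: read the current member list and reconcile it against
  # the apiserver pod list with rs.add()/rs.remove() on the master.
  mongo --host "$MASTER" --quiet --eval 'rs.conf().members.forEach(function(m){ print(m.host) })'
  # ... diff against pod_ips and add/remove the difference (omitted) ...
else
  # No master yet: initiate the replica set here and add the other pods.
  mongo --quiet --eval 'rs.initiate()'
  for ip in $(pod_ips); do
    [ "$ip" = "$POD_IP" ] || mongo --quiet --eval "rs.add('${ip}:27017')"
  done
fi
```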
Obviously, this could cause problems if a ReplicationController starts many pods at once on bootstrap (race condition to become the master and create the replica set). One could acquire a lock by setting a key in etcd (a sketch follows), but that makes things more complicated. If the process is assumed to bootstrap one mongo pod first, then scale, that seems very reasonable to me.
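For completeness, the etcd lock is nearly a one-liner with the v2 API — assuming a reachable endpoint (the `http://etcd:2379` address and key name here are made up):

```sh
# prevExist=false makes the PUT atomic: exactly one pod gets "action":"create"
# back, and only that pod runs rs.initiate().
if curl -s -XPUT "http://etcd:2379/v2/keys/mongo-rs-init?prevExist=false" \
     -d value="$HOSTNAME" | grep -q '"action":"create"'; then
  mongo --quiet --eval 'rs.initiate()'
fi
```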
When a pod dies, the RC will restart it, and the reconfig will happen on postStart, so there doesn’t seem to be a need for a worker running at all times checking the state of the cluster: if a node is gone, it will reconnect, rejoin, and clear up its old IP.
Leftover removed pods may only be a problem when scaling down. A similar reconfig script could be run on the container lifecycle `preStop` hook (sketched below).
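In the simplest case, assuming the same `$MASTER` and `$POD_IP` variables as above, that preStop script might be a single statement:

```sh
# Deregister this pod from the replica set before it shuts down.
mongo --host "$MASTER" --quiet --eval "rs.remove('${POD_IP}:27017')"
```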
What are your thoughts?
Cheers
PetSets should be able to help with this.
Closing this as an issue.