Slow re-election when elected master pod is deleted


First of all - thank you guys for the chart!

I was playing around with the multi-node example and experienced some odd behavior. Here’s how I’m reproducing the issue.

After the multi-node example is deployed, forward the multi-data service to your local machine in one terminal:

$ kubectl port-forward service/multi-data 9200

Watch the call to /_cat/master in another terminal:

$ watch -n1 time curl -s http://localhost:9200/_cat/master?v

In a third terminal, delete whichever pod is currently the elected master:

$ kubectl delete pod multi-master-0

The API call in the second terminal will now hang. After 30 seconds the request times out, and we might see the following error for a split second:

{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}

Soon after, the cluster recovers and the API call from the second window starts responding again. Here are the logs from another master node before and after the re-election:

[2019-02-17T01:49:03,736][INFO ][o.e.d.z.ZenDiscovery     ] [multi-master-1] master_left [{multi-master-0}{KZPjmKZtSf2LGV-IyvtfOg}{Es2wrbyiQdWz5CnT4V5wkA}{10.40.1.14}{10.40.1.14:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}], reason [shut_down]
[2019-02-17T01:49:03,736][WARN ][o.e.d.z.ZenDiscovery     ] [multi-master-1] master left (reason = shut_down), current nodes: nodes:
   {multi-master-0}{KZPjmKZtSf2LGV-IyvtfOg}{Es2wrbyiQdWz5CnT4V5wkA}{10.40.1.14}{10.40.1.14:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}, master
   {multi-data-0}{Ndz2WGGiSz6Y1XO1tWyWRw}{nPoZagPdRr2Tq45BPAkh_g}{10.40.2.13}{10.40.2.13:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}
   {multi-data-2}{MmkHmP6XRTibriDLazv_1A}{iS1mfA0JQ-yURqZ4-ng2zQ}{10.40.1.8}{10.40.1.8:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}
   {multi-master-2}{0PQvO9UhT--kvv2mEuJyBg}{WVb0wCALRP6KIt2oHNTmcQ}{10.40.0.20}{10.40.0.20:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}
   {multi-master-1}{uetiuhetRbasFRNLqI6ixg}{AwsDmF2aTZS1JesCE0HZ0A}{10.40.2.15}{10.40.2.15:9300}{ml.machine_memory=2147483648, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}, local
   {multi-data-1}{Uh9aucMmS6uLRBeG5559_w}{tlxqIDoZRjCLMfMWxU0IyQ}{10.40.0.11}{10.40.0.11:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}

[2019-02-17T01:49:03,830][WARN ][o.e.t.TcpTransport       ] [multi-master-1] send message failed [channel: Netty4TcpChannel{localAddress=0.0.0.0/0.0.0.0:9300, remoteAddress=/10.40.1.14:36958}]
java.nio.channels.ClosedChannelException: null
        at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source) ~[?:?]
[2019-02-17T01:49:07,107][INFO ][o.e.c.s.ClusterApplierService] [multi-master-1] detected_master {multi-master-2}{0PQvO9UhT--kvv2mEuJyBg}{WVb0wCALRP6KIt2oHNTmcQ}{10.40.0.20}{10.40.0.20:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}, reason: apply cluster state (from master [master {multi-master-2}{0PQvO9UhT--kvv2mEuJyBg}{WVb0wCALRP6KIt2oHNTmcQ}{10.40.0.20}{10.40.0.20:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true} committed version [95]])
[2019-02-17T01:49:34,832][WARN ][o.e.c.NodeConnectionsService] [multi-master-1] failed to connect to node {multi-master-0}{KZPjmKZtSf2LGV-IyvtfOg}{Es2wrbyiQdWz5CnT4V5wkA}{10.40.1.14}{10.40.1.14:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true} (tried [1] times)
org.elasticsearch.transport.ConnectTransportException: [multi-master-0][10.40.1.14:9300] connect_timeout[30s]
        at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.onTimeout(TcpTransport.java:1576) ~[elasticsearch-6.6.0.jar:6.6.0]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:660) ~[elasticsearch-6.6.0.jar:6.6.0]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:834) [?:?]
[2019-02-17T01:49:37,200][WARN ][o.e.c.s.ClusterApplierService] [multi-master-1] cluster state applier task [apply cluster state (from master [master {multi-master-2}{0PQvO9UhT--kvv2mEuJyBg}{WVb0wCALRP6KIt2oHNTmcQ}{10.40.0.20}{10.40.0.20:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true} committed version [95]])] took [30s] above the warn threshold of 30s
[2019-02-17T01:49:40,315][INFO ][o.e.c.s.ClusterApplierService] [multi-master-1] removed {{multi-master-0}{KZPjmKZtSf2LGV-IyvtfOg}{Es2wrbyiQdWz5CnT4V5wkA}{10.40.1.14}{10.40.1.14:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true},}, reason: apply cluster state (from master [master {multi-master-2}{0PQvO9UhT--kvv2mEuJyBg}{WVb0wCALRP6KIt2oHNTmcQ}{10.40.0.20}{10.40.0.20:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true} committed version [96]])
[2019-02-17T01:49:55,945][WARN ][o.e.t.TransportService   ] [multi-master-1] Received response for a request that has timed out, sent [47893ms] ago, timed out [17870ms] ago, action [internal:discovery/zen/fd/master_ping], node [{multi-master-2}{0PQvO9UhT--kvv2mEuJyBg}{WVb0wCALRP6KIt2oHNTmcQ}{10.40.0.20}{10.40.0.20:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}], id [579]
[2019-02-17T01:50:03,784][INFO ][o.e.c.s.ClusterApplierService] [multi-master-1] added {{multi-master-0}{KZPjmKZtSf2LGV-IyvtfOg}{LqkyAGeTSUW2mtNmt49HIA}{10.40.1.15}{10.40.1.15:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true},}, reason: apply cluster state (from master [master {multi-master-2}{0PQvO9UhT--kvv2mEuJyBg}{WVb0wCALRP6KIt2oHNTmcQ}{10.40.0.20}{10.40.0.20:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true} committed version [98]])

I figured Kubernetes might be killing the pods too abruptly, so I followed the instructions at https://www.elastic.co/guide/en/elasticsearch/reference/6.6/stopping-elasticsearch.html for stopping Elasticsearch. Sure enough, if we send SIGTERM to the Elasticsearch process in the elected master pod directly, the re-election is quick!

Assuming multi-master-2 is the new master:

$ kubectl exec multi-master-2 -- kill -SIGTERM 1

Notice how the API call from the second terminal only hangs for around 3 seconds this time!

Reading through the docs on pod termination (https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods), Kubernetes does in fact send a SIGTERM to the container, so I'm guessing that deleting a pod does something beyond just sending a SIGTERM, and that extra something is what Elasticsearch doesn't like.
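
For what it's worth, one plausible difference is everything else that deletion triggers: the pod is removed from its Service endpoints and its IP eventually disappears entirely, so the other nodes' connections to the old master time out (the connect_timeout[30s] in the logs above) rather than being refused. A rough way to watch this while repeating the test (the headless service name below is a guess based on the chart's naming convention, so adjust it to whatever your release actually created):

# Confirm the grace period Kubernetes allows between SIGTERM and SIGKILL (default 30s)
$ kubectl get pod multi-master-0 -o jsonpath='{.spec.terminationGracePeriodSeconds}'

# Watch the endpoints behind the masters' headless service while deleting the elected
# master in another terminal; the pod's IP is dropped almost immediately
$ kubectl get endpoints multi-master-headless --watch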

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 12 (6 by maintainers)

Top GitHub Comments

2 reactions
DaveWHarvey commented, Mar 9, 2019

Here is what worked for me, based on the above suggestion. I mounted a script into the container and run it instead of the Docker entrypoint (shown below). It still takes 30s to time out the old master, but the new master seemed to be operational within about 4 seconds of the old master shutting down.
Note: I had already made another fix that seems conceptually necessary: a pre-shutdown hook on the master that delays termination a bit if there is not a quorum + 1 of master-eligible nodes (a rough sketch of that idea follows the script below). On a k8s rolling upgrade, a restarted master node is considered "ready" as soon as it has opened port 9200, i.e. before it has been added to the cluster. That allows the rolling upgrade to terminate the existing master before the new master-eligible node has fully joined, so the master election might not have a quorum.

if [[ -z $NODE_MASTER || "$NODE_MASTER" = "true" ]]; then

  # Run ES as a background task, forward SIGTERM to it, then wait for it to exit
  trap 'kill $(jobs -p)' SIGTERM

  /usr/local/bin/docker-entrypoint.sh elasticsearch &

  wait

  # Now keep the pod alive for 30s after ES dies so that we will refuse connections
  # from the new master rather than them needing to time out
  sleep 30

else

  exec /usr/local/bin/docker-entrypoint.sh elasticsearch

fi
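
For illustration only, the pre-shutdown check mentioned at the top of this comment might look roughly like the sketch below. It is a sketch under assumptions, not something the chart ships: MIN_MASTERS is a made-up variable, the script assumes it runs inside the master container with an unauthenticated HTTP port on localhost:9200, and it assumes each node's name matches its pod hostname.

#!/usr/bin/env bash
# Hypothetical preStop sketch: wait until enough *other* master-eligible nodes are
# visible before letting this master-eligible pod terminate.
MIN_MASTERS="${MIN_MASTERS:-2}"   # e.g. quorum for a 3-master cluster

for _ in $(seq 1 30); do
  # _cat/nodes lists each node's name and roles; "m" in node.role means master-eligible.
  others=$(curl -s 'http://localhost:9200/_cat/nodes?h=name,node.role' \
             | awk -v self="$(hostname)" '$1 != self && $2 ~ /m/' | wc -l)
  if [ "${others:-0}" -ge "$MIN_MASTERS" ]; then
    exit 0
  fi
  sleep 2
done

exit 0   # give up after ~60s rather than blocking pod deletion forever

Whether something like this lives in a lifecycle preStop hook or gets folded into the wrapper above is a separate choice; the point is simply not to take down a master-eligible node while the remaining ones could not form a quorum.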

1 reaction
Crazybus commented, May 3, 2019

This has been merged into master but not yet released. I'm leaving this open until it is released and others have confirmed that this solution resolves the issue properly.
