Readiness probe failed: Waiting for elasticsearch cluster to become ready
Chart version: https://github.com/elastic/helm-charts/tree/7.9
Kubernetes version:
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.7", GitCommit:"169db3bff4b5fb7722e967c5b6356713f05f15ed", GitTreeState:"clean", BuildDate:"2020-04-03T16:14:09Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}
Kubernetes provider: Azure Kubernetes Cluster
Helm Version:
version.BuildInfo{Version:"v3.4.1", GitCommit:"c4e74854886b2efe3321e185578e6db9be0a6e29", GitTreeState:"clean", GoVersion:"go1.14.11"}
Output of helm get release:
NAME: dummy-elasticsearch
LAST DEPLOYED: Wed Nov 18 10:51:41 2020
NAMESPACE: dummy-elasticsearch
STATUS: deployed
REVISION: 1
USER-SUPPLIED VALUES:
antiAffinity: hard
antiAffinityTopologyKey: kubernetes.io/hostname
clusterHealthCheckParams: wait_for_status=green&timeout=2s
clusterName: elasticsearch
enableServiceLinks: true
envFrom: []
esConfig: {}
esJavaOpts: -Xmx1g -Xms1g
esMajorVersion: ""
extraContainers: []
extraEnvs: []
extraInitContainers: []
extraVolumeMounts: []
extraVolumes: []
fsGroup: ""
fullnameOverride: ""
httpPort: 9200
image: docker.elastic.co/elasticsearch/elasticsearch
imagePullPolicy: IfNotPresent
imagePullSecrets: []
imageTag: 7.9.4-SNAPSHOT
ingress:
annotations:
kubernetes.io/ingress.class: nginx
enabled: true
hosts:
- dummy-elasticsearch.eastus2.cloudapp.azure.com
path: /
tls: []
initResources: {}
keystore: []
labels: {}
lifecycle: {}
masterService: ""
masterTerminationFix: false
maxUnavailable: 1
minimumMasterNodes: 1
nameOverride: ""
networkHost: 0.0.0.0
nodeAffinity: {}
nodeGroup: master
nodeSelector: {}
persistence:
annotations: {}
enabled: true
labels:
enabled: false
podAnnotations: {}
podManagementPolicy: Parallel
podSecurityContext:
fsGroup: 1000
runAsUser: 1000
podSecurityPolicy:
create: false
name: ""
spec:
fsGroup:
rule: RunAsAny
privileged: true
runAsUser:
rule: RunAsAny
seLinux:
rule: RunAsAny
supplementalGroups:
rule: RunAsAny
volumes:
- secret
- configMap
- persistentVolumeClaim
priorityClassName: ""
protocol: http
rbac:
create: false
serviceAccountAnnotations: {}
serviceAccountName: ""
readinessProbe:
failureThreshold: 4
initialDelaySeconds: 20
periodSeconds: 20
successThreshold: 3
timeoutSeconds: 10
replicas: 1
resources:
limits:
cpu: 1000m
memory: 2Gi
requests:
cpu: 1000m
memory: 2Gi
roles:
data: "true"
ingest: "true"
master: "true"
remote_cluster_client: "true"
schedulerName: ""
secretMounts: []
securityContext:
capabilities:
drop:
- ALL
runAsNonRoot: true
runAsUser: 1000
service:
annotations: {}
externalTrafficPolicy: ""
httpPortName: http
labels: {}
labelsHeadless: {}
loadBalancerIP: ""
loadBalancerSourceRanges: []
nodePort: ""
transportPortName: transport
type: ClusterIP
sidecarResources: {}
sysctlInitContainer:
enabled: true
sysctlVmMaxMapCount: 262144
terminationGracePeriod: 120
tolerations: []
transportPort: 9300
updateStrategy: RollingUpdate
volumeClaimTemplate:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 30Gi
COMPUTED VALUES:
antiAffinity: hard
antiAffinityTopologyKey: kubernetes.io/hostname
clusterHealthCheckParams: wait_for_status=green&timeout=2s
clusterName: elasticsearch
enableServiceLinks: true
envFrom: []
esConfig: {}
esJavaOpts: -Xmx1g -Xms1g
esMajorVersion: ""
extraContainers: []
extraEnvs: []
extraInitContainers: []
extraVolumeMounts: []
extraVolumes: []
fsGroup: ""
fullnameOverride: ""
httpPort: 9200
image: docker.elastic.co/elasticsearch/elasticsearch
imagePullPolicy: IfNotPresent
imagePullSecrets: []
imageTag: 7.9.4-SNAPSHOT
ingress:
annotations:
kubernetes.io/ingress.class: nginx
enabled: true
hosts:
- dummy-elasticsearch.eastus2.cloudapp.azure.com
path: /
tls: []
initResources: {}
keystore: []
labels: {}
lifecycle: {}
masterService: ""
masterTerminationFix: false
maxUnavailable: 1
minimumMasterNodes: 1
nameOverride: ""
networkHost: 0.0.0.0
nodeAffinity: {}
nodeGroup: master
nodeSelector: {}
persistence:
annotations: {}
enabled: true
labels:
enabled: false
podAnnotations: {}
podManagementPolicy: Parallel
podSecurityContext:
fsGroup: 1000
runAsUser: 1000
podSecurityPolicy:
create: false
name: ""
spec:
fsGroup:
rule: RunAsAny
privileged: true
runAsUser:
rule: RunAsAny
seLinux:
rule: RunAsAny
supplementalGroups:
rule: RunAsAny
volumes:
- secret
- configMap
- persistentVolumeClaim
priorityClassName: ""
protocol: http
rbac:
create: false
serviceAccountAnnotations: {}
serviceAccountName: ""
readinessProbe:
failureThreshold: 4
initialDelaySeconds: 20
periodSeconds: 20
successThreshold: 3
timeoutSeconds: 10
replicas: 1
resources:
limits:
cpu: 1000m
memory: 2Gi
requests:
cpu: 1000m
memory: 2Gi
roles:
data: "true"
ingest: "true"
master: "true"
remote_cluster_client: "true"
schedulerName: ""
secretMounts: []
securityContext:
capabilities:
drop:
- ALL
runAsNonRoot: true
runAsUser: 1000
service:
annotations: {}
externalTrafficPolicy: ""
httpPortName: http
labels: {}
labelsHeadless: {}
loadBalancerIP: ""
loadBalancerSourceRanges: []
nodePort: ""
transportPortName: transport
type: ClusterIP
sidecarResources: {}
sysctlInitContainer:
enabled: true
sysctlVmMaxMapCount: 262144
terminationGracePeriod: 120
tolerations: []
transportPort: 9300
updateStrategy: RollingUpdate
volumeClaimTemplate:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 30Gi
HOOKS:
---
# Source: elasticsearch/templates/test/test-elasticsearch-health.yaml
apiVersion: v1
kind: Pod
metadata:
name: "dummy-elasticsearch-yedjv-test"
annotations:
"helm.sh/hook": test-success
spec:
securityContext:
fsGroup: 1000
runAsUser: 1000
containers:
- name: "dummy-elasticsearch-piqwt-test"
image: "docker.elastic.co/elasticsearch/elasticsearch:7.9.4-SNAPSHOT"
imagePullPolicy: "IfNotPresent"
command:
- "sh"
- "-c"
- |
#!/usr/bin/env bash -e
curl -XGET --fail 'elasticsearch-master:9200/_cluster/health?wait_for_status=green&timeout=2s'
restartPolicy: Never
MANIFEST:
---
# Source: elasticsearch/templates/poddisruptionbudget.yaml
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
name: "elasticsearch-master-pdb"
spec:
maxUnavailable: 1
selector:
matchLabels:
app: "elasticsearch-master"
---
# Source: elasticsearch/templates/service.yaml
kind: Service
apiVersion: v1
metadata:
name: elasticsearch-master
labels:
heritage: "Helm"
release: "dummy-elasticsearch"
chart: "elasticsearch"
app: "elasticsearch-master"
annotations:
{}
spec:
type: ClusterIP
selector:
release: "dummy-elasticsearch"
chart: "elasticsearch"
app: "elasticsearch-master"
ports:
- name: http
protocol: TCP
port: 9200
- name: transport
protocol: TCP
port: 9300
---
# Source: elasticsearch/templates/service.yaml
kind: Service
apiVersion: v1
metadata:
name: elasticsearch-master-headless
labels:
heritage: "Helm"
release: "dummy-elasticsearch"
chart: "elasticsearch"
app: "elasticsearch-master"
annotations:
service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
spec:
clusterIP: None # This is needed for statefulset hostnames like elasticsearch-0 to resolve
# Create endpoints also if the related pod isn't ready
publishNotReadyAddresses: true
selector:
app: "elasticsearch-master"
ports:
- name: http
port: 9200
- name: transport
port: 9300
---
# Source: elasticsearch/templates/statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: elasticsearch-master
labels:
heritage: "Helm"
release: "dummy-elasticsearch"
chart: "elasticsearch"
app: "elasticsearch-master"
annotations:
esMajorVersion: "7"
spec:
serviceName: elasticsearch-master-headless
selector:
matchLabels:
app: "elasticsearch-master"
replicas: 1
podManagementPolicy: Parallel
updateStrategy:
type: RollingUpdate
volumeClaimTemplates:
- metadata:
name: elasticsearch-master
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 30Gi
template:
metadata:
name: "elasticsearch-master"
labels:
heritage: "Helm"
release: "dummy-elasticsearch"
chart: "elasticsearch"
app: "elasticsearch-master"
annotations:
spec:
securityContext:
fsGroup: 1000
runAsUser: 1000
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: app
operator: In
values:
- "elasticsearch-master"
topologyKey: kubernetes.io/hostname
terminationGracePeriodSeconds: 120
volumes:
enableServiceLinks: true
initContainers:
- name: configure-sysctl
securityContext:
runAsUser: 0
privileged: true
image: "docker.elastic.co/elasticsearch/elasticsearch:7.9.4-SNAPSHOT"
imagePullPolicy: "IfNotPresent"
command: ["sysctl", "-w", "vm.max_map_count=262144"]
resources:
{}
containers:
- name: "elasticsearch"
securityContext:
capabilities:
drop:
- ALL
runAsNonRoot: true
runAsUser: 1000
image: "docker.elastic.co/elasticsearch/elasticsearch:7.9.4-SNAPSHOT"
imagePullPolicy: "IfNotPresent"
readinessProbe:
  exec:
    command:
      - sh
      - -c
      - |
        #!/usr/bin/env bash -e
        # If the node is starting up wait for the cluster to be ready (request params: "wait_for_status=green&timeout=2s" )
        # Once it has started only check that the node itself is responding
        START_FILE=/tmp/.es_start_file
        # Disable nss cache to avoid filling dentry cache when calling curl
        # This is required with Elasticsearch Docker using nss < 3.52
        export NSS_SDB_USE_CACHE=no
        http () {
          local path="${1}"
          local args="${2}"
          set -- -XGET -s
          if [ "$args" != "" ]; then
            set -- "$@" $args
          fi
          if [ -n "${ELASTIC_USERNAME}" ] && [ -n "${ELASTIC_PASSWORD}" ]; then
            set -- "$@" -u "${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}"
          fi
          curl --output /dev/null -k "$@" "http://127.0.0.1:9200${path}"
        }
        if [ -f "${START_FILE}" ]; then
          echo 'Elasticsearch is already running, lets check the node is healthy'
          HTTP_CODE=$(http "/" "-w %{http_code}")
          RC=$?
          if [[ ${RC} -ne 0 ]]; then
            echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' \${BASIC_AUTH} http://127.0.0.1:9200/ failed with RC ${RC}"
            exit ${RC}
          fi
          # ready if HTTP code 200, 503 is tolerable if ES version is 6.x
          if [[ ${HTTP_CODE} == "200" ]]; then
            exit 0
          elif [[ ${HTTP_CODE} == "503" && "7" == "6" ]]; then
            exit 0
          else
            echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' \${BASIC_AUTH} http://127.0.0.1:9200/ failed with HTTP code ${HTTP_CODE}"
            exit 1
          fi
        else
          echo 'Waiting for elasticsearch cluster to become ready (request params: "wait_for_status=green&timeout=2s" )'
          if http "/_cluster/health?wait_for_status=green&timeout=2s" "--fail" ; then
            touch ${START_FILE}
            exit 0
          else
            echo 'Cluster is not yet ready (request params: "wait_for_status=green&timeout=2s" )'
            exit 1
          fi
        fi
  failureThreshold: 4
  initialDelaySeconds: 20
  periodSeconds: 20
  successThreshold: 3
  timeoutSeconds: 10
ports:
- name: http
containerPort: 9200
- name: transport
containerPort: 9300
resources:
limits:
cpu: 1000m
memory: 2Gi
requests:
cpu: 1000m
memory: 2Gi
env:
- name: node.name
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: cluster.initial_master_nodes
value: "elasticsearch-master-0,"
- name: discovery.seed_hosts
value: "elasticsearch-master-headless"
- name: cluster.name
value: "elasticsearch"
- name: network.host
value: "0.0.0.0"
- name: ES_JAVA_OPTS
value: "-Xmx1g -Xms1g"
- name: node.data
value: "true"
- name: node.ingest
value: "true"
- name: node.master
value: "true"
- name: node.remote_cluster_client
value: "true"
volumeMounts:
- name: "elasticsearch-master"
mountPath: /usr/share/elasticsearch/data
---
# Source: elasticsearch/templates/ingress.yaml
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
name: elasticsearch-master
labels:
app: elasticsearch
release: dummy-elasticsearch
heritage: Helm
annotations:
kubernetes.io/ingress.class: nginx
spec:
rules:
- host: dummy-elasticsearch.eastus2.cloudapp.azure.com
http:
paths:
- path: /
backend:
serviceName: elasticsearch-master
servicePort: 9200
NOTES:
1. Watch all cluster members come up.
$ kubectl get pods --namespace=dummy-elasticsearch -l app=elasticsearch-master -w
2. Test cluster health using Helm test.
$ helm test dummy-elasticsearch --cleanup
Describe the bug:
Steps to reproduce:
- Fetch https://github.com/elastic/helm-charts/tree/7.9
- Navigate to the elasticsearch directory
- Run helm install dummy-elasticsearch . --namespace dummy-elasticsearch --create-namespace -f values.yaml (values are customized, as seen in the summary output above)
- Run kubectl describe pod elasticsearch-master-0 --namespace=dummy-elasticsearch
Expected behavior: The pod is healthy.
Provide logs and/or server output (if relevant):
Instead, kubectl describe reports:
Volumes:
elasticsearch-master:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: elasticsearch-master-elasticsearch-master-0
ReadOnly: false
default-token-b65j6:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-b65j6
Optional: false
QoS Class: Guaranteed
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 15m default-scheduler Successfully assigned dummy-elasticsearch/elasticsearch-master-0 to aks-agentpool-41636598-0
Normal SuccessfulAttachVolume 14m attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-d996de8f-2847-411f-9852-571bd73a7b9e"
Normal Pulled 14m kubelet, aks-agentpool-41636598-0 Container image "docker.elastic.co/elasticsearch/elasticsearch:7.9.4-SNAPSHOT" already present on machine
Normal Created 14m kubelet, aks-agentpool-41636598-0 Created container configure-sysctl
Normal Started 14m kubelet, aks-agentpool-41636598-0 Started container configure-sysctl
Normal Pulled 14m kubelet, aks-agentpool-41636598-0 Container image "docker.elastic.co/elasticsearch/elasticsearch:7.9.4-SNAPSHOT" already present on machine
Normal Created 14m kubelet, aks-agentpool-41636598-0 Created container elasticsearch
Normal Started 14m kubelet, aks-agentpool-41636598-0 Started container elasticsearch
Warning Unhealthy 13m kubelet, aks-agentpool-41636598-0 Readiness probe failed: Waiting for elasticsearch cluster to become ready (request params: "wait_for_status=green&timeout=2s" )
Cluster is not yet ready (request params: "wait_for_status=green&timeout=2s" )
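For reading the single Unhealthy event above, it can help to work out how long the configured probe needs before the pod can be marked Ready at all. The sketch below is plain arithmetic on the readinessProbe values from the helm output; it assumes the first probe runs right after initialDelaySeconds and ignores kubelet scheduling jitter and the run time of the probe itself, so treat the result as a rough lower bound.

```shell
#!/bin/sh
# Values copied from the readinessProbe block in the helm output above.
initial_delay=20      # initialDelaySeconds
period=20             # periodSeconds
success_threshold=3   # successThreshold

# The first probe runs after the initial delay; the pod is marked Ready only
# after success_threshold consecutive passes, one probe per period.
# Assumption: no jitter and every probe passes on the first attempt.
earliest_ready=$(( initial_delay + (success_threshold - 1) * period ))

echo "earliest Ready: about ${earliest_ready}s after the container starts"
```

With these settings the pod cannot be Ready earlier than about a minute after the container starts, even if every probe passes.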
Any additional context:
I tried to use wait_for_status=yellow&timeout=2s, as suggested in https://github.com/elastic/helm-charts/issues/783#issuecomment-701037663, but it did not help.
kubectl exec --stdin --tty elasticsearch-master-0 --namespace=dummy-elasticsearch -- /bin/bash
[elasticsearch@elasticsearch-master-0 ~]$ curl http://localhost:9200/_cluster/health
{"cluster_name":"elasticsearch","status":"green","timed_out":false,"number_of_nodes":1,"number_of_data_nodes":1,"active_primary_shards":0,"active_shards":0,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":0,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":100.0}
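The manual curl call above differs from what the probe runs: the probe targets 127.0.0.1, adds the wait_for_status=green&timeout=2s parameters, and passes --fail so that an HTTP error status becomes a non-zero exit code. A minimal sketch for replaying the probe's startup branch by hand follows; check_startup is a hypothetical helper name, and the default URL and query string are copied from the rendered manifest.

```shell
#!/bin/sh
# Sketch: replay the startup branch of the chart's readiness probe by hand.
# check_startup is a hypothetical helper, not part of the chart.
check_startup() {
  # $1: base URL; the default matches the rendered manifest.
  base="${1:-http://127.0.0.1:9200}"
  if curl -s --fail --output /dev/null \
      "${base}/_cluster/health?wait_for_status=green&timeout=2s"; then
    echo "probe would pass"
  else
    # Inside the else branch, $? still holds curl's exit status.
    echo "probe would fail (curl exit code $?)"
  fi
}
```

To use it, open a shell in the pod with the kubectl exec command above, paste the function, and call check_startup. With --fail, curl exits with code 22 when the server answers with an HTTP error status, and with code 7 when it cannot connect at all.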
Issue Analytics
- State:
- Created 3 years ago
- Reactions:10
- Comments:14 (5 by maintainers)
Top GitHub Comments
I had the same issue with the readiness probe. I tried the http probe solution, which worked for me, but the main issue was that I had set createCert: false and xpack.security.enabled: false, so the protocol value in the values.yaml file should be changed to http, which by default is https.

By the way, readiness here is really too restrictive and, most of the time, prevents Elasticsearch from starting after a crash, especially the wait_for_status=green flag. I ended up using a simple http probe, so the pod becomes ready quicker, which is much better for master nodes.
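The thread does not show the "simple http probe" itself. As a rough comparison, assuming it means a Kubernetes httpGet probe against port 9200, the sketch below puts the two decision rules side by side: the steady-state branch of the chart's rendered probe script, and the httpGet rule that any status from 200 up to (but not including) 400 counts as success. Both function names are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# chart_probe_ready mirrors the steady-state branch of the rendered probe:
#   $1 = HTTP status returned by GET /, $2 = Elasticsearch major version.
# Ready on 200; 503 is tolerated only for 6.x.
chart_probe_ready() {
  if [ "$1" = "200" ]; then
    return 0
  elif [ "$1" = "503" ] && [ "$2" = "6" ]; then
    return 0
  fi
  return 1
}

# httpget_probe_ready mirrors the Kubernetes httpGet rule:
#   any status >= 200 and < 400 passes.
httpget_probe_ready() {
  [ "$1" -ge 200 ] && [ "$1" -lt 400 ]
}

chart_probe_ready 200 7 && echo "chart probe: ready on 200"
chart_probe_ready 503 7 || echo "chart probe: not ready on 503 for 7.x"
httpget_probe_ready 200 && echo "httpGet probe: ready on 200"
```

The larger difference is what this comparison leaves out: an httpGet probe has no startup branch, so it never waits for wait_for_status=green before the pod is first marked Ready.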