
Readiness probe failed: Waiting for elasticsearch cluster to become ready


Chart version: https://github.com/elastic/helm-charts/tree/7.9

Kubernetes version:

Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.7", GitCommit:"169db3bff4b5fb7722e967c5b6356713f05f15ed", GitTreeState:"clean", BuildDate:"2020-04-03T16:14:09Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}

Kubernetes provider: Azure Kubernetes Cluster

Helm Version: version.BuildInfo{Version:"v3.4.1", GitCommit:"c4e74854886b2efe3321e185578e6db9be0a6e29", GitTreeState:"clean", GoVersion:"go1.14.11"}

Output of helm get release:
NAME: dummy-elasticsearch
LAST DEPLOYED: Wed Nov 18 10:51:41 2020
NAMESPACE: dummy-elasticsearch
STATUS: deployed
REVISION: 1
USER-SUPPLIED VALUES:
antiAffinity: hard
antiAffinityTopologyKey: kubernetes.io/hostname
clusterHealthCheckParams: wait_for_status=green&timeout=2s
clusterName: elasticsearch
enableServiceLinks: true
envFrom: []
esConfig: {}
esJavaOpts: -Xmx1g -Xms1g
esMajorVersion: ""
extraContainers: []
extraEnvs: []
extraInitContainers: []
extraVolumeMounts: []
extraVolumes: []
fsGroup: ""
fullnameOverride: ""
httpPort: 9200
image: docker.elastic.co/elasticsearch/elasticsearch
imagePullPolicy: IfNotPresent
imagePullSecrets: []
imageTag: 7.9.4-SNAPSHOT
ingress:
  annotations:
    kubernetes.io/ingress.class: nginx
  enabled: true
  hosts:
  - dummy-elasticsearch.eastus2.cloudapp.azure.com
  path: /
  tls: []
initResources: {}
keystore: []
labels: {}
lifecycle: {}
masterService: ""
masterTerminationFix: false
maxUnavailable: 1
minimumMasterNodes: 1
nameOverride: ""
networkHost: 0.0.0.0
nodeAffinity: {}
nodeGroup: master
nodeSelector: {}
persistence:
  annotations: {}
  enabled: true
  labels:
    enabled: false
podAnnotations: {}
podManagementPolicy: Parallel
podSecurityContext:
  fsGroup: 1000
  runAsUser: 1000
podSecurityPolicy:
  create: false
  name: ""
  spec:
    fsGroup:
      rule: RunAsAny
    privileged: true
    runAsUser:
      rule: RunAsAny
    seLinux:
      rule: RunAsAny
    supplementalGroups:
      rule: RunAsAny
    volumes:
    - secret
    - configMap
    - persistentVolumeClaim
priorityClassName: ""
protocol: http
rbac:
  create: false
  serviceAccountAnnotations: {}
  serviceAccountName: ""
readinessProbe:
  failureThreshold: 4
  initialDelaySeconds: 20
  periodSeconds: 20
  successThreshold: 3
  timeoutSeconds: 10
replicas: 1
resources:
  limits:
    cpu: 1000m
    memory: 2Gi
  requests:
    cpu: 1000m
    memory: 2Gi
roles:
  data: "true"
  ingest: "true"
  master: "true"
  remote_cluster_client: "true"
schedulerName: ""
secretMounts: []
securityContext:
  capabilities:
    drop:
    - ALL
  runAsNonRoot: true
  runAsUser: 1000
service:
  annotations: {}
  externalTrafficPolicy: ""
  httpPortName: http
  labels: {}
  labelsHeadless: {}
  loadBalancerIP: ""
  loadBalancerSourceRanges: []
  nodePort: ""
  transportPortName: transport
  type: ClusterIP
sidecarResources: {}
sysctlInitContainer:
  enabled: true
sysctlVmMaxMapCount: 262144
terminationGracePeriod: 120
tolerations: []
transportPort: 9300
updateStrategy: RollingUpdate
volumeClaimTemplate:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi

COMPUTED VALUES: identical to the USER-SUPPLIED VALUES above.

HOOKS:
---
# Source: elasticsearch/templates/test/test-elasticsearch-health.yaml
apiVersion: v1
kind: Pod
metadata:
  name: "dummy-elasticsearch-yedjv-test"
  annotations:
    "helm.sh/hook": test-success
spec:
  securityContext:
    fsGroup: 1000
    runAsUser: 1000
  containers:
  - name: "dummy-elasticsearch-piqwt-test"
    image: "docker.elastic.co/elasticsearch/elasticsearch:7.9.4-SNAPSHOT"
    imagePullPolicy: "IfNotPresent"
    command:
      - "sh"
      - "-c"
      - |
        #!/usr/bin/env bash -e
        curl -XGET --fail 'elasticsearch-master:9200/_cluster/health?wait_for_status=green&timeout=2s'
  restartPolicy: Never
MANIFEST:
---
# Source: elasticsearch/templates/poddisruptionbudget.yaml
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: "elasticsearch-master-pdb"
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: "elasticsearch-master"
---
# Source: elasticsearch/templates/service.yaml
kind: Service
apiVersion: v1
metadata:
  name: elasticsearch-master
  labels:
    heritage: "Helm"
    release: "dummy-elasticsearch"
    chart: "elasticsearch"
    app: "elasticsearch-master"
  annotations:
    {}
spec:
  type: ClusterIP
  selector:
    release: "dummy-elasticsearch"
    chart: "elasticsearch"
    app: "elasticsearch-master"
  ports:
  - name: http
    protocol: TCP
    port: 9200
  - name: transport
    protocol: TCP
    port: 9300
---
# Source: elasticsearch/templates/service.yaml
kind: Service
apiVersion: v1
metadata:
  name: elasticsearch-master-headless
  labels:
    heritage: "Helm"
    release: "dummy-elasticsearch"
    chart: "elasticsearch"
    app: "elasticsearch-master"
  annotations:
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
spec:
  clusterIP: None # This is needed for statefulset hostnames like elasticsearch-0 to resolve
  # Create endpoints also if the related pod isn't ready
  publishNotReadyAddresses: true
  selector:
    app: "elasticsearch-master"
  ports:
  - name: http
    port: 9200
  - name: transport
    port: 9300
---
# Source: elasticsearch/templates/statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch-master
  labels:
    heritage: "Helm"
    release: "dummy-elasticsearch"
    chart: "elasticsearch"
    app: "elasticsearch-master"
  annotations:
    esMajorVersion: "7"
spec:
  serviceName: elasticsearch-master-headless
  selector:
    matchLabels:
      app: "elasticsearch-master"
  replicas: 1
  podManagementPolicy: Parallel
  updateStrategy:
    type: RollingUpdate
  volumeClaimTemplates:
  - metadata:
      name: elasticsearch-master
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 30Gi
  template:
    metadata:
      name: "elasticsearch-master"
      labels:
        heritage: "Helm"
        release: "dummy-elasticsearch"
        chart: "elasticsearch"
        app: "elasticsearch-master"
      annotations:
        
    spec:
      securityContext:
        fsGroup: 1000
        runAsUser: 1000
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - "elasticsearch-master"
            topologyKey: kubernetes.io/hostname
      terminationGracePeriodSeconds: 120
      volumes:
      enableServiceLinks: true
      initContainers:
      - name: configure-sysctl
        securityContext:
          runAsUser: 0
          privileged: true
        image: "docker.elastic.co/elasticsearch/elasticsearch:7.9.4-SNAPSHOT"
        imagePullPolicy: "IfNotPresent"
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        resources:
          {}

      containers:
      - name: "elasticsearch"
        securityContext:
          capabilities:
            drop:
            - ALL
          runAsNonRoot: true
          runAsUser: 1000
        image: "docker.elastic.co/elasticsearch/elasticsearch:7.9.4-SNAPSHOT"
        imagePullPolicy: "IfNotPresent"
        readinessProbe:
          exec:
            command:
              - sh
              - -c
              - |
                #!/usr/bin/env bash -e
                # If the node is starting up wait for the cluster to be ready (request params: "wait_for_status=green&timeout=2s" )
                # Once it has started only check that the node itself is responding
                START_FILE=/tmp/.es_start_file

                # Disable nss cache to avoid filling dentry cache when calling curl
                # This is required with Elasticsearch Docker using nss < 3.52
                export NSS_SDB_USE_CACHE=no

                http () {
                  local path="${1}"
                  local args="${2}"
                  set -- -XGET -s

                  if [ "$args" != "" ]; then
                    set -- "$@" $args
                  fi

                  if [ -n "${ELASTIC_USERNAME}" ] && [ -n "${ELASTIC_PASSWORD}" ]; then
                    set -- "$@" -u "${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}"
                  fi

                  curl --output /dev/null -k "$@" "http://127.0.0.1:9200${path}"
                }

                if [ -f "${START_FILE}" ]; then
                  echo 'Elasticsearch is already running, lets check the node is healthy'
                  HTTP_CODE=$(http "/" "-w %{http_code}")
                  RC=$?
                  if [[ ${RC} -ne 0 ]]; then
                    echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' \${BASIC_AUTH} http://127.0.0.1:9200/ failed with RC ${RC}"
                    exit ${RC}
                  fi
                  # ready if HTTP code 200, 503 is tolerable if ES version is 6.x
                  if [[ ${HTTP_CODE} == "200" ]]; then
                    exit 0
                  elif [[ ${HTTP_CODE} == "503" && "7" == "6" ]]; then
                    exit 0
                  else
                    echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' \${BASIC_AUTH} http://127.0.0.1:9200/ failed with HTTP code ${HTTP_CODE}"
                    exit 1
                  fi

                else
                  echo 'Waiting for elasticsearch cluster to become ready (request params: "wait_for_status=green&timeout=2s" )'
                  if http "/_cluster/health?wait_for_status=green&timeout=2s" "--fail" ; then
                    touch ${START_FILE}
                    exit 0
                  else
                    echo 'Cluster is not yet ready (request params: "wait_for_status=green&timeout=2s" )'
                    exit 1
                  fi
                fi
          failureThreshold: 4
          initialDelaySeconds: 20
          periodSeconds: 20
          successThreshold: 3
          timeoutSeconds: 10
        ports:
        - name: http
          containerPort: 9200
        - name: transport
          containerPort: 9300
        resources:
          limits:
            cpu: 1000m
            memory: 2Gi
          requests:
            cpu: 1000m
            memory: 2Gi
        env:
          - name: node.name
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: cluster.initial_master_nodes
            value: "elasticsearch-master-0,"
          - name: discovery.seed_hosts
            value: "elasticsearch-master-headless"
          - name: cluster.name
            value: "elasticsearch"
          - name: network.host
            value: "0.0.0.0"
          - name: ES_JAVA_OPTS
            value: "-Xmx1g -Xms1g"
          - name: node.data
            value: "true"
          - name: node.ingest
            value: "true"
          - name: node.master
            value: "true"
          - name: node.remote_cluster_client
            value: "true"
        volumeMounts:
          - name: "elasticsearch-master"
            mountPath: /usr/share/elasticsearch/data
---
# Source: elasticsearch/templates/ingress.yaml
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: elasticsearch-master
  labels:
    app: elasticsearch
    release: dummy-elasticsearch
    heritage: Helm
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
    - host: dummy-elasticsearch.eastus2.cloudapp.azure.com
      http:
        paths:
          - path: /
            backend:
              serviceName: elasticsearch-master
              servicePort: 9200

NOTES:
1. Watch all cluster members come up.
  $ kubectl get pods --namespace=dummy-elasticsearch -l app=elasticsearch-master -w
2. Test cluster health using Helm test.
  $ helm test dummy-elasticsearch --cleanup
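
A note on how the probe above behaves: on first boot (before /tmp/.es_start_file exists) it queries /_cluster/health with the configured clusterHealthCheckParams and fails until that request succeeds; afterwards it only checks that the node answers on port 9200. With periodSeconds: 20 and successThreshold: 3, the cluster must report green on three consecutive checks, roughly a minute, before the pod is marked Ready. Below is a minimal sketch of a values override that relaxes the first-boot check for a single-node cluster (the yellow status and the timing numbers are illustrative assumptions, not chart recommendations):

  # values-override.yaml (hypothetical)
  clusterHealthCheckParams: "wait_for_status=yellow&timeout=5s"
  readinessProbe:
    initialDelaySeconds: 30
    periodSeconds: 10
    successThreshold: 1   # mark Ready after the first passing check
    failureThreshold: 6
    timeoutSeconds: 10

This would be applied with something like helm upgrade dummy-elasticsearch . --namespace dummy-elasticsearch -f values.yaml -f values-override.yaml.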

Describe the bug:

Steps to reproduce:

  1. Fetch https://github.com/elastic/helm-charts/tree/7.9
  2. Navigate to elasticsearch directory
  3. helm install dummy-elasticsearch . --namespace dummy-elasticsearch --create-namespace -f values.yaml (values are customized, as seen in the summary output above)
  4. kubectl describe pod elasticsearch-master-0 --namespace=dummy-elasticsearch
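
Concretely, the steps above correspond to something like the following (the branch name is assumed from the URL in step 1):

  git clone --branch 7.9 https://github.com/elastic/helm-charts.git
  cd helm-charts/elasticsearch
  helm install dummy-elasticsearch . --namespace dummy-elasticsearch --create-namespace -f values.yaml
  kubectl describe pod elasticsearch-master-0 --namespace=dummy-elasticsearch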

Expected behavior: The pod is healthy.

Provide logs and/or server output (if relevant):

Instead, the pod never becomes ready; kubectl describe shows:

Volumes:
  elasticsearch-master:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  elasticsearch-master-elasticsearch-master-0
    ReadOnly:   false
  default-token-b65j6:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-b65j6
    Optional:    false
QoS Class:       Guaranteed
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                  Age   From                               Message
  ----     ------                  ----  ----                               -------
  Normal   Scheduled               15m   default-scheduler                  Successfully assigned dummy-elasticsearch/elasticsearch-master-0 to aks-agentpool-41636598-0
  Normal   SuccessfulAttachVolume  14m   attachdetach-controller            AttachVolume.Attach succeeded for volume "pvc-d996de8f-2847-411f-9852-571bd73a7b9e"
  Normal   Pulled                  14m   kubelet, aks-agentpool-41636598-0  Container image "docker.elastic.co/elasticsearch/elasticsearch:7.9.4-SNAPSHOT" already present on machine
  Normal   Created                 14m   kubelet, aks-agentpool-41636598-0  Created container configure-sysctl
  Normal   Started                 14m   kubelet, aks-agentpool-41636598-0  Started container configure-sysctl
  Normal   Pulled                  14m   kubelet, aks-agentpool-41636598-0  Container image "docker.elastic.co/elasticsearch/elasticsearch:7.9.4-SNAPSHOT" already present on machine
  Normal   Created                 14m   kubelet, aks-agentpool-41636598-0  Created container elasticsearch
  Normal   Started                 14m   kubelet, aks-agentpool-41636598-0  Started container elasticsearch
  Warning  Unhealthy               13m   kubelet, aks-agentpool-41636598-0  Readiness probe failed: Waiting for elasticsearch cluster to become ready (request params: "wait_for_status=green&timeout=2s" )
Cluster is not yet ready (request params: "wait_for_status=green&timeout=2s" )
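
Note that a single Unhealthy event does not necessarily mean the probe is still failing; with successThreshold: 3 the pod may simply be partway through the three consecutive successes it needs. The current Ready condition can be read directly (a sketch, assuming the pod and namespace names above):

  kubectl get pod elasticsearch-master-0 --namespace=dummy-elasticsearch \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'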

Any additional context: I tried to use wait_for_status=yellow&timeout=2s, as suggested in https://github.com/elastic/helm-charts/issues/783#issuecomment-701037663, but it did not help.
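
When changing clusterHealthCheckParams appears to have no effect, it is worth confirming that the rendered StatefulSet actually picked up the new value, since the parameter is baked into the probe script at template time and requires a helm upgrade to change. A quick check, assuming the release name above:

  helm get manifest dummy-elasticsearch --namespace dummy-elasticsearch | grep wait_for_status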

kubectl exec --stdin --tty elasticsearch-master-0 --namespace=dummy-elasticsearch -- /bin/bash

[elasticsearch@elasticsearch-master-0 ~]$ curl http://localhost:9200/_cluster/health
{"cluster_name":"elasticsearch","status":"green","timed_out":false,"number_of_nodes":1,"number_of_data_nodes":1,"active_primary_shards":0,"active_shards":0,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":0,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":100.0}[elasticsearch@elasticsearch-master-0 ~]$

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 10
  • Comments: 14 (5 by maintainers)

Top GitHub Comments

amirEBD commented on May 28, 2022 (2 reactions)

I had the same issue with the readiness probe. The HTTP probe solution worked for me, but the main issue was that I had set createCert: false and xpack.security.enabled: false, so the protocol value in values.yaml, which defaults to https, had to be changed to http.
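
In other words, if TLS is disabled on the Elasticsearch side, the probe must also speak plain HTTP. A minimal sketch of the values this comment describes (createCert and the https default belong to newer chart versions, so treat the exact keys as assumptions and check your chart's values.yaml):

  protocol: http            # must match what Elasticsearch actually serves
  createCert: false         # present in newer charts, not in all 7.x charts
  esConfig:
    elasticsearch.yml: |
      xpack.security.enabled: false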

ebuildy commented on Jul 9, 2021 (2 reactions)

By the way, the readiness check here is really too restrictive, and most of the time it prevents Elasticsearch from starting again after a crash, especially because of the wait_for_status=green flag.

I ended up using a simple HTTP probe instead, so the pod becomes ready sooner, which is much better for master nodes.
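
For reference, the chart renders the exec probe shown in the manifest above, so switching to a plain HTTP probe like this would mean editing the statefulset template (or patching the StatefulSet after install). A sketch of what the replacement could look like (illustrative values, using standard Kubernetes probe fields):

  readinessProbe:
    httpGet:
      path: /
      port: 9200
      scheme: HTTP
    initialDelaySeconds: 30
    periodSeconds: 10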
