Elasticsearch pods fail to start up
Chart Version: 7.3.0
Kubernetes Version: 1.13
Kubernetes provider: AWS (EKS cluster)
Helm Version: 2.11.0
Every time I run helm install --name elasticsearch elastic/elasticsearch --namespace elasticsearch --set resources.requests.memory=1.5Gi --set resources.limits.memory=1.5Gi --tiller-namespace spinnaker,
it appears to deploy the master pods, but they never start.
NAME: elasticsearch
LAST DEPLOYED: Wed Sep 11 17:49:30 2019
NAMESPACE: elasticsearch
STATUS: DEPLOYED
RESOURCES:
==> v1/Pod(related)
NAME READY STATUS RESTARTS AGE
elasticsearch-master-0 0/1 Init:0/1 0 1s
elasticsearch-master-1 0/1 Terminating 0 5m46s
elasticsearch-master-2 0/1 Terminating 4 5m46s
==> v1/Service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
elasticsearch-master ClusterIP 172.20.72.18 <none> 9200/TCP,9300/TCP 1s
elasticsearch-master-headless ClusterIP None <none> 9200/TCP,9300/TCP 1s
==> v1beta1/PodDisruptionBudget
NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE
elasticsearch-master-pdb N/A 1 0 1s
==> v1beta1/StatefulSet
NAME READY AGE
elasticsearch-master 0/3 1s
NOTES:
1. Watch all cluster members come up.
$ kubectl get pods --namespace=elasticsearch -l app=elasticsearch-master -w
2. Test cluster health using Helm test.
$ helm test elasticsearch
Here is the output of running helm get elasticsearch --tiller-namespace spinnaker:
REVISION: 1
RELEASED: Wed Sep 11 17:49:30 2019
CHART: elasticsearch-7.3.0
USER-SUPPLIED VALUES:
resources:
  limits:
    memory: 1.5Gi
  requests:
    memory: 1.5Gi
COMPUTED VALUES:
antiAffinity: hard
antiAffinityTopologyKey: kubernetes.io/hostname
clusterHealthCheckParams: wait_for_status=green&timeout=1s
clusterName: elasticsearch
esConfig: {}
esJavaOpts: -Xmx1g -Xms1g
esMajorVersion: ""
extraEnvs: []
extraInitContainers: []
extraVolumeMounts: []
extraVolumes: []
fsGroup: ""
fullnameOverride: ""
httpPort: 9200
image: docker.elastic.co/elasticsearch/elasticsearch
imagePullPolicy: IfNotPresent
imagePullSecrets: []
imageTag: 7.3.0
ingress:
  annotations: {}
  enabled: false
  hosts:
  - chart-example.local
  path: /
  tls: []
initResources: {}
labels: {}
lifecycle: {}
masterService: ""
masterTerminationFix: false
maxUnavailable: 1
minimumMasterNodes: 2
nameOverride: ""
networkHost: 0.0.0.0
nodeAffinity: {}
nodeGroup: master
nodeSelector: {}
persistence:
  annotations: {}
  enabled: true
podAnnotations: {}
podManagementPolicy: Parallel
podSecurityContext:
  fsGroup: 1000
priorityClassName: ""
protocol: http
readinessProbe:
  failureThreshold: 3
  initialDelaySeconds: 10
  periodSeconds: 10
  successThreshold: 3
  timeoutSeconds: 5
replicas: 3
resources:
  limits:
    cpu: 1000m
    memory: 1.5Gi
  requests:
    cpu: 100m
    memory: 1.5Gi
roles:
  data: "true"
  ingest: "true"
  master: "true"
schedulerName: ""
secretMounts: []
securityContext:
  capabilities:
    drop:
    - ALL
  runAsNonRoot: true
  runAsUser: 1000
service:
  annotations: {}
  nodePort: null
  type: ClusterIP
sidecarResources: {}
sysctlInitContainer:
  enabled: true
sysctlVmMaxMapCount: 262144
terminationGracePeriod: 120
tolerations: []
transportPort: 9300
updateStrategy: RollingUpdate
volumeClaimTemplate:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi
HOOKS:
---
# elasticsearch-sjzcy-test
apiVersion: v1
kind: Pod
metadata:
  name: "elasticsearch-sjzcy-test"
  annotations:
    "helm.sh/hook": test-success
spec:
  containers:
  - name: "elasticsearch-sphns-test"
    image: "docker.elastic.co/elasticsearch/elasticsearch:7.3.0"
    command:
      - "sh"
      - "-c"
      - |
        #!/usr/bin/env bash -e
        curl -XGET --fail 'elasticsearch-master:9200/_cluster/health?wait_for_status=green&timeout=1s'
  restartPolicy: Never
MANIFEST:
---
# Source: elasticsearch/templates/poddisruptionbudget.yaml
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: "elasticsearch-master-pdb"
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: "elasticsearch-master"
---
# Source: elasticsearch/templates/service.yaml
kind: Service
apiVersion: v1
metadata:
  name: elasticsearch-master
  labels:
    heritage: "Tiller"
    release: "elasticsearch"
    chart: "elasticsearch-7.3.0"
    app: "elasticsearch-master"
  annotations:
    {}
spec:
  type: ClusterIP
  selector:
    heritage: "Tiller"
    release: "elasticsearch"
    chart: "elasticsearch-7.3.0"
    app: "elasticsearch-master"
  ports:
  - name: http
    protocol: TCP
    port: 9200
  - name: transport
    protocol: TCP
    port: 9300
---
# Source: elasticsearch/templates/service.yaml
kind: Service
apiVersion: v1
metadata:
  name: elasticsearch-master-headless
  labels:
    heritage: "Tiller"
    release: "elasticsearch"
    chart: "elasticsearch-7.3.0"
    app: "elasticsearch-master"
  annotations:
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
spec:
  clusterIP: None # This is needed for statefulset hostnames like elasticsearch-0 to resolve
  # Create endpoints also if the related pod isn't ready
  publishNotReadyAddresses: true
  selector:
    app: "elasticsearch-master"
  ports:
  - name: http
    port: 9200
  - name: transport
    port: 9300
---
# Source: elasticsearch/templates/statefulset.yaml
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: elasticsearch-master
  labels:
    heritage: "Tiller"
    release: "elasticsearch"
    chart: "elasticsearch-7.3.0"
    app: "elasticsearch-master"
  annotations:
    esMajorVersion: "7"
spec:
  serviceName: elasticsearch-master-headless
  selector:
    matchLabels:
      app: "elasticsearch-master"
  replicas: 3
  podManagementPolicy: Parallel
  updateStrategy:
    type: RollingUpdate
  volumeClaimTemplates:
  - metadata:
      name: elasticsearch-master
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 30Gi
  template:
    metadata:
      name: "elasticsearch-master"
      labels:
        heritage: "Tiller"
        release: "elasticsearch"
        chart: "elasticsearch-7.3.0"
        app: "elasticsearch-master"
      annotations:
    spec:
      securityContext:
        fsGroup: 1000
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - "elasticsearch-master"
            topologyKey: kubernetes.io/hostname
      terminationGracePeriodSeconds: 120
      volumes:
      initContainers:
      - name: configure-sysctl
        securityContext:
          runAsUser: 0
          privileged: true
        image: "docker.elastic.co/elasticsearch/elasticsearch:7.3.0"
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        resources:
          {}
      containers:
      - name: "elasticsearch"
        securityContext:
          capabilities:
            drop:
            - ALL
          runAsNonRoot: true
          runAsUser: 1000
        image: "docker.elastic.co/elasticsearch/elasticsearch:7.3.0"
        imagePullPolicy: "IfNotPresent"
        readinessProbe:
          failureThreshold: 3
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 3
          timeoutSeconds: 5
          exec:
            command:
            - sh
            - -c
            - |
              #!/usr/bin/env bash -e
              # If the node is starting up wait for the cluster to be ready (request params: 'wait_for_status=green&timeout=1s' )
              # Once it has started only check that the node itself is responding
              START_FILE=/tmp/.es_start_file
              http () {
                  local path="${1}"
                  if [ -n "${ELASTIC_USERNAME}" ] && [ -n "${ELASTIC_PASSWORD}" ]; then
                    BASIC_AUTH="-u ${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}"
                  else
                    BASIC_AUTH=''
                  fi
                  curl -XGET -s -k --fail ${BASIC_AUTH} http://127.0.0.1:9200${path}
              }
              if [ -f "${START_FILE}" ]; then
                  echo 'Elasticsearch is already running, lets check the node is healthy'
                  http "/"
              else
                  echo 'Waiting for elasticsearch cluster to become cluster to be ready (request params: "wait_for_status=green&timeout=1s" )'
                  if http "/_cluster/health?wait_for_status=green&timeout=1s" ; then
                      touch ${START_FILE}
                      exit 0
                  else
                      echo 'Cluster is not yet ready (request params: "wait_for_status=green&timeout=1s" )'
                      exit 1
                  fi
              fi
        ports:
        - name: http
          containerPort: 9200
        - name: transport
          containerPort: 9300
        resources:
          limits:
            cpu: 1000m
            memory: 1.5Gi
          requests:
            cpu: 100m
            memory: 1.5Gi
        env:
        - name: node.name
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: cluster.initial_master_nodes
          value: "elasticsearch-master-0,elasticsearch-master-1,elasticsearch-master-2,"
        - name: discovery.seed_hosts
          value: "elasticsearch-master-headless"
        - name: cluster.name
          value: "elasticsearch"
        - name: network.host
          value: "0.0.0.0"
        - name: ES_JAVA_OPTS
          value: "-Xmx1g -Xms1g"
        - name: node.data
          value: "true"
        - name: node.ingest
          value: "true"
        - name: node.master
          value: "true"
        volumeMounts:
        - name: "elasticsearch-master"
          mountPath: /usr/share/elasticsearch/data
And here is the output of running kubectl get pods --namespace=elasticsearch -l app=elasticsearch-master:
NAME READY STATUS RESTARTS AGE
elasticsearch-master-0 0/1 CrashLoopBackOff 4 4m
elasticsearch-master-1 0/1 Running 0 4m
elasticsearch-master-2 0/1 Error 4 3m
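To dig further on a failure like this, it usually helps to describe the crashing pods and pull their logs: describe shows scheduling problems (for example unsatisfied anti-affinity rules) and probe failures, while the container logs show why Elasticsearch itself exited. A quick sketch, using the namespace and pod names from the output above:
$ kubectl describe pod elasticsearch-master-0 --namespace=elasticsearch
$ kubectl logs elasticsearch-master-0 --namespace=elasticsearch
$ kubectl get events --namespace=elasticsearch --sort-by=.lastTimestamp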
Top GitHub Comments
Found the problem: antiAffinity was set to hard, but I didn’t have enough nodes, so only one pod was actually scheduled. Since that single pod couldn’t reach green status (it couldn’t find enough masters), the deploy hung.
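For anyone hitting the same scheduling trap, here is a rough sketch of two workarounds. Both rely only on values that already appear in the computed values above (antiAffinity and replicas); treat the exact flags as an assumption to verify against your chart version:
# Option 1: relax the anti-affinity so several masters can share a node (fine for dev/test)
$ helm install --name elasticsearch elastic/elasticsearch --namespace elasticsearch --tiller-namespace spinnaker --set antiAffinity=soft --set resources.requests.memory=1.5Gi --set resources.limits.memory=1.5Gi
# Option 2: keep hard anti-affinity and shrink the cluster to match the nodes that exist
$ helm install --name elasticsearch elastic/elasticsearch --namespace elasticsearch --tiller-namespace spinnaker --set replicas=1
Either option trades away the resilience the defaults are designed for (multiple masters on separate nodes), so they are ways to get unblocked on a small cluster; adding worker nodes and keeping the defaults is the proper fix.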
This issue has been automatically closed because it has not had recent activity since being marked as stale.