helm upgrade fails when trying to resize PVC

See original GitHub issue

Chart version: 6.8.13
Kubernetes version: v1.16.13-gke.401
Kubernetes provider: GKE (Google Kubernetes Engine)
Helm version: v2.16.1

helm get release output

e.g. helm get elasticsearch (replace elasticsearch with the name of your helm release)

Be careful to obfuscate any secrets (credentials, tokens, public IPs, …) that could be visible in the output before copy-pasting.

If you find any secrets in plain text in the helm get release output, you should use Kubernetes Secrets to manage them in a secure way (see the Security Example).
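
As a hedged illustration of that advice, credentials can be stored in a Kubernetes Secret and injected through the chart's extraEnvs value (extraEnvs appears in the computed values below, and the readiness-probe script in the manifest already honors ELASTIC_USERNAME/ELASTIC_PASSWORD). The Secret name and the literal values here are made up:

# Sketch: keep credentials out of plain-text values (names are hypothetical)
kubectl create secret generic es-credentials \
  --from-literal=username=elastic \
  --from-literal=password='change-me'

# Reference the Secret from the chart's extraEnvs value:
cat <<'EOF' > secure-values.yaml
extraEnvs:
  - name: ELASTIC_USERNAME
    valueFrom:
      secretKeyRef:
        name: es-credentials
        key: username
  - name: ELASTIC_PASSWORD
    valueFrom:
      secretKeyRef:
        name: es-credentials
        key: password
EOF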

Output of helm get release
REVISION: 20
RELEASED: Wed Nov 11 11:30:44 2020
CHART: elasticsearch-6.8.13
USER-SUPPLIED VALUES:
esJavaOpts: -Xmx5g -Xms5g
minimumMasterNodes: 5
replicas: 8
resources:
  limits:
    cpu: 1600m
    memory: 9Gi
  requests:
    cpu: 1200m
    memory: 8Gi
volumeClaimTemplate:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi

COMPUTED VALUES:
antiAffinity: hard
antiAffinityTopologyKey: kubernetes.io/hostname
clusterHealthCheckParams: wait_for_status=green&timeout=1s
clusterName: elasticsearch
enableServiceLinks: true
envFrom: []
esConfig: {}
esJavaOpts: -Xmx5g -Xms5g
esMajorVersion: ""
extraContainers: []
extraEnvs: []
extraInitContainers: []
extraVolumeMounts: []
extraVolumes: []
fsGroup: ""
fullnameOverride: ""
httpPort: 9200
image: docker.elastic.co/elasticsearch/elasticsearch
imagePullPolicy: IfNotPresent
imagePullSecrets: []
imageTag: 6.8.13
ingress:
  annotations: {}
  enabled: false
  hosts:
  - chart-example.local
  path: /
  tls: []
initResources: {}
keystore: []
labels: {}
lifecycle: {}
masterService: ""
masterTerminationFix: false
maxUnavailable: 1
minimumMasterNodes: 5
nameOverride: ""
networkHost: 0.0.0.0
nodeAffinity: {}
nodeGroup: master
nodeSelector: {}
persistence:
  annotations: {}
  enabled: true
  labels:
    enabled: false
podAnnotations: {}
podManagementPolicy: Parallel
podSecurityContext:
  fsGroup: 1000
  runAsUser: 1000
podSecurityPolicy:
  create: false
  name: ""
  spec:
    fsGroup:
      rule: RunAsAny
    privileged: true
    runAsUser:
      rule: RunAsAny
    seLinux:
      rule: RunAsAny
    supplementalGroups:
      rule: RunAsAny
    volumes:
    - secret
    - configMap
    - persistentVolumeClaim
priorityClassName: ""
protocol: http
rbac:
  create: false
  serviceAccountAnnotations: {}
  serviceAccountName: ""
readinessProbe:
  failureThreshold: 3
  initialDelaySeconds: 10
  periodSeconds: 10
  successThreshold: 3
  timeoutSeconds: 5
replicas: 8
resources:
  limits:
    cpu: 1600m
    memory: 9Gi
  requests:
    cpu: 1200m
    memory: 8Gi
roles:
  data: "true"
  ingest: "true"
  master: "true"
schedulerName: ""
secretMounts: []
securityContext:
  capabilities:
    drop:
    - ALL
  runAsNonRoot: true
  runAsUser: 1000
service:
  annotations: {}
  externalTrafficPolicy: ""
  httpPortName: http
  labels: {}
  labelsHeadless: {}
  loadBalancerIP: ""
  loadBalancerSourceRanges: []
  nodePort: ""
  transportPortName: transport
  type: ClusterIP
sidecarResources: {}
sysctlInitContainer:
  enabled: true
sysctlVmMaxMapCount: 262144
terminationGracePeriod: 120
tolerations: []
transportPort: 9300
updateStrategy: RollingUpdate
volumeClaimTemplate:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi

HOOKS:
---
# elasticsearch-prod-syael-test
apiVersion: v1
kind: Pod
metadata:
  name: "elasticsearch-prod-syael-test"
  annotations:
    "helm.sh/hook": test-success
spec:
  securityContext:
    fsGroup: 1000
    runAsUser: 1000

  containers:
  - name: "elasticsearch-prod-guqjb-test"
    image: "docker.elastic.co/elasticsearch/elasticsearch:6.8.13"
    imagePullPolicy: "IfNotPresent"
    command:
      - "sh"
      - "-c"
      - |
        #!/usr/bin/env bash -e
        curl -XGET --fail 'elasticsearch-master:9200/_cluster/health?wait_for_status=green&timeout=1s'
  restartPolicy: Never
MANIFEST:

---
# Source: elasticsearch/templates/poddisruptionbudget.yaml
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: "elasticsearch-master-pdb"
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: "elasticsearch-master"
---
# Source: elasticsearch/templates/service.yaml
kind: Service
apiVersion: v1
metadata:
  name: elasticsearch-master-headless
  labels:
    heritage: "Tiller"
    release: "elasticsearch-prod"
    chart: "elasticsearch"
    app: "elasticsearch-master"
  annotations:
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
spec:
  clusterIP: None # This is needed for statefulset hostnames like elasticsearch-0 to resolve
  # Create endpoints also if the related pod isn't ready
  publishNotReadyAddresses: true
  selector:
    app: "elasticsearch-master"
  ports:
  - name: http
    port: 9200
  - name: transport
    port: 9300
---
# Source: elasticsearch/templates/service.yaml
kind: Service
apiVersion: v1
metadata:
  name: elasticsearch-master
  labels:
    heritage: "Tiller"
    release: "elasticsearch-prod"
    chart: "elasticsearch"
    app: "elasticsearch-master"
  annotations:
    {}

spec:
  type: ClusterIP
  selector:
    release: "elasticsearch-prod"
    chart: "elasticsearch"
    app: "elasticsearch-master"
  ports:
  - name: http
    protocol: TCP
    port: 9200
  - name: transport
    protocol: TCP
    port: 9300
---
# Source: elasticsearch/templates/statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch-master
  labels:
    heritage: "Tiller"
    release: "elasticsearch-prod"
    chart: "elasticsearch"
    app: "elasticsearch-master"
  annotations:
    esMajorVersion: "6"
spec:
  serviceName: elasticsearch-master-headless
  selector:
    matchLabels:
      app: "elasticsearch-master"
  replicas: 8
  podManagementPolicy: Parallel
  updateStrategy:
    type: RollingUpdate
  volumeClaimTemplates:
  - metadata:
      name: elasticsearch-master
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 50Gi

  template:
    metadata:
      name: "elasticsearch-master"
      labels:
        heritage: "Tiller"
        release: "elasticsearch-prod"
        chart: "elasticsearch"
        app: "elasticsearch-master"
      annotations:

    spec:
      securityContext:
        fsGroup: 1000
        runAsUser: 1000

      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - "elasticsearch-master"
            topologyKey: kubernetes.io/hostname
      terminationGracePeriodSeconds: 120
      volumes:
      enableServiceLinks: true
      initContainers:
      - name: configure-sysctl
        securityContext:
          runAsUser: 0
          privileged: true
        image: "docker.elastic.co/elasticsearch/elasticsearch:6.8.13"
        imagePullPolicy: "IfNotPresent"
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        resources:
          {}


      containers:
      - name: "elasticsearch"
        securityContext:
          capabilities:
            drop:
            - ALL
          runAsNonRoot: true
          runAsUser: 1000

        image: "docker.elastic.co/elasticsearch/elasticsearch:6.8.13"
        imagePullPolicy: "IfNotPresent"
        readinessProbe:
          exec:
            command:
              - sh
              - -c
              - |
                #!/usr/bin/env bash -e
                # If the node is starting up wait for the cluster to be ready (request params: "wait_for_status=green&timeout=1s" )
                # Once it has started only check that the node itself is responding
                START_FILE=/tmp/.es_start_file

                # Disable nss cache to avoid filling dentry cache when calling curl
                # This is required with Elasticsearch Docker using nss < 3.52
                export NSS_SDB_USE_CACHE=no

                http () {
                  local path="${1}"
                  local args="${2}"
                  set -- -XGET -s

                  if [ "$args" != "" ]; then
                    set -- "$@" $args
                  fi

                  if [ -n "${ELASTIC_USERNAME}" ] && [ -n "${ELASTIC_PASSWORD}" ]; then
                    set -- "$@" -u "${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}"
                  fi

                  curl --output /dev/null -k "$@" "http://127.0.0.1:9200${path}"
                }

                if [ -f "${START_FILE}" ]; then
                  echo 'Elasticsearch is already running, lets check the node is healthy'
                  HTTP_CODE=$(http "/" "-w %{http_code}")
                  RC=$?
                  if [[ ${RC} -ne 0 ]]; then
                    echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' \${BASIC_AUTH} http://127.0.0.1:9200/ failed with RC ${RC}"
                    exit ${RC}
                  fi
                  # ready if HTTP code 200, 503 is tolerable if ES version is 6.x
                  if [[ ${HTTP_CODE} == "200" ]]; then
                    exit 0
                  elif [[ ${HTTP_CODE} == "503" && "6" == "6" ]]; then
                    exit 0
                  else
                    echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' \${BASIC_AUTH} http://127.0.0.1:9200/ failed with HTTP code ${HTTP_CODE}"
                    exit 1
                  fi

                else
                  echo 'Waiting for elasticsearch cluster to become ready (request params: "wait_for_status=green&timeout=1s" )'
                  if http "/_cluster/health?wait_for_status=green&timeout=1s" "--fail" ; then
                    touch ${START_FILE}
                    exit 0
                  else
                    echo 'Cluster is not yet ready (request params: "wait_for_status=green&timeout=1s" )'
                    exit 1
                  fi
                fi
          failureThreshold: 3
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 3
          timeoutSeconds: 5

        ports:
        - name: http
          containerPort: 9200
        - name: transport
          containerPort: 9300
        resources:
          limits:
            cpu: 1600m
            memory: 9Gi
          requests:
            cpu: 1200m
            memory: 8Gi

        env:
          - name: node.name
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: discovery.zen.minimum_master_nodes
            value: "5"
          - name: discovery.zen.ping.unicast.hosts
            value: "elasticsearch-master-headless"
          - name: cluster.name
            value: "elasticsearch"
          - name: network.host
            value: "0.0.0.0"
          - name: ES_JAVA_OPTS
            value: "-Xmx5g -Xms5g"
          - name: node.data
            value: "true"
          - name: node.ingest
            value: "true"
          - name: node.master
            value: "true"
        volumeMounts:
          - name: "elasticsearch-master"
            mountPath: /usr/share/elasticsearch/data

Describe the bug:

Error: UPGRADE FAILED: StatefulSet.apps "elasticsearch-master" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', and 'updateStrategy' are forbidden

Steps to reproduce:

  1. Freshly deployed helm release
  2. Increase the PV size from 30Gi to e.g. 50Gi in values.yaml (when working with helmfile):
volumeClaimTemplate:
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 50Gi
  3. helmfile apply (an equivalent plain-Helm invocation is sketched below)
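
For reference, the failing upgrade can also be attempted without helmfile. A minimal sketch using plain Helm, with the release and chart names taken from the output above (the values file name is an assumption):

# Sketch: reproduce the failing upgrade with plain Helm
helm repo add elastic https://helm.elastic.co
helm upgrade elasticsearch-prod elastic/elasticsearch \
  --version 6.8.13 \
  -f values.yaml   # values.yaml now requests storage: 50Gi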

Expected behavior: The upgrade succeeds and the PV size is automatically increased.

Provide logs and/or server output (if relevant):

Be careful to obfuscate any secrets (credentials, tokens, public IPs, …) that could be visible in the output before copy-pasting.

Any additional context: There have been some other issues where it is mentioned that the label in the StatefulSet is the problem. I have read that, for PVs to resize automatically with a StatefulSet, Elasticsearch must be split into multiple Helm releases.

Questions:

  1. If that is the problem, what is being changed in the StatefulSet that cannot be changed? Is it the template? (See the sketch below.)
  2. What is the reason behind this? Why would multiple releases solve it?
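
To make question 1 concrete: once a StatefulSet exists, the API server rejects changes to every spec field except replicas, template, and updateStrategy, and volumeClaimTemplates is one of the frozen fields, so any upgrade that renders a different storage request is refused. A quick sketch of the difference, using the StatefulSet name from the manifest above:

# Allowed: replicas is one of the three mutable spec fields
kubectl patch statefulset elasticsearch-master -p '{"spec":{"replicas":9}}'

# Rejected with the same "Forbidden" error as the helm upgrade:
# volumeClaimTemplates is immutable after creation
kubectl patch statefulset elasticsearch-master \
  -p '{"spec":{"volumeClaimTemplates":[{"metadata":{"name":"elasticsearch-master"},"spec":{"resources":{"requests":{"storage":"50Gi"}}}}]}}'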

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

2 reactions
elwood218 commented, Nov 12, 2020

Hi @jmlrt, ah OK, thanks for clearing that up! I will do some further reading on that, and thanks for providing the workaround. I will write it down in case we hit such an issue in production systems.

So I guess we can close that ticket then.
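
The workaround itself isn't quoted in this archive, but the commonly used approach for this error is to delete the StatefulSet while orphaning its pods, expand the PVCs directly, and then run the upgrade again. A sketch, assuming the StorageClass has allowVolumeExpansion: true and using the names and replica count from the manifest above:

# 1. Remove the StatefulSet object only; its pods keep running
#    (on kubectl older than 1.20 the flag is --cascade=false)
kubectl delete statefulset elasticsearch-master --cascade=orphan

# 2. Expand each claim; PVCs are named <template>-<statefulset>-<ordinal>
for i in 0 1 2 3 4 5 6 7; do
  kubectl patch pvc elasticsearch-master-elasticsearch-master-$i \
    -p '{"spec":{"resources":{"requests":{"storage":"50Gi"}}}}'
done

# 3. Re-create the StatefulSet with the new size via a normal upgrade,
#    then confirm the CAPACITY column reports 50Gi
helm upgrade elasticsearch-prod elastic/elasticsearch -f values.yaml
kubectl get pvc -l app=elasticsearch-master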

0 reactions
elwood218 commented, Nov 16, 2020

You are right. I didn't remember the operator…

Read more comments on GitHub >
