
Adding a new master HelmRelease (HR) to perform a rolling upgrade conflicts

See original GitHub issue

Chart version: 6.8.9 -> 7.10.1

Kubernetes version: v1.17.13-gke.2600

Kubernetes provider: GKE

Helm Version: 3

helm get release output

e.g. helm get elasticsearch (replace elasticsearch with the name of your helm release)

Be careful to obfuscate any secrets (credentials, tokens, public IPs, …) that could be visible in the output before copy-pasting.

If you find any secrets in plain text in the helm get release output, you should use Kubernetes Secrets to manage them in a secure way (see Security Example).
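
The gcs-credentials entry under keystore in the values below is an example of that pattern. A minimal sketch of such a Secret, assuming the chart loads each key of the referenced Secret into the Elasticsearch keystore (the gcs.client.default.credentials_file key is a hypothetical entry for the GCS repository plugin, not copied from this cluster):

apiVersion: v1
kind: Secret
metadata:
  name: gcs-credentials
  namespace: elasticsearch
type: Opaque
stringData:
  # each key in the Secret becomes one Elasticsearch keystore entry
  gcs.client.default.credentials_file: "<placeholder service-account JSON>"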

Output of helm get release
$ helm get values elasticsearch-master  -n elasticsearch
USER-SUPPLIED VALUES:
antiAffinity: hard
antiAffinityTopologyKey: kubernetes.io/zone
clusterName: elasticsearch
esConfig:
  log4j2.properties: |
    logger.deprecation.level = warn
    logger.deprecation.name = org.elasticsearch.deprecation
esJavaOpts: -XX:+UseContainerSupport -XX:MaxRAMPercentage=50.0 -XX:InitialRAMPercentage=50.0
image: gcr.io/company/elasticsearch/elasticsearch
imagePullPolicy: IfNotPresent
imageTag: 6.8.9
keystore:
- secretName: gcs-credentials
labels:
  service: elastic-master
masterService: elasticsearch-master
maxUnavailable: 1
minimumMasterNodes: 2
nodeAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    nodeSelectorTerms:
    - matchExpressions:
      - key: instancegroup
        operator: In
        values:
        - elasticsearch
nodeGroup: master
nodeSelector:
  instancegroup: elasticsearch
persistence:
  enabled: true
replicas: 3
resources:
  limits:
    cpu: 500m
    memory: 4Gi
  requests:
    cpu: 200m
    memory: 4Gi
roles:
  data: "false"
  ingest: "false"
  master: "true"
tolerations:
- effect: NoSchedule
  key: dedicated
  value: elasticsearch
volumeClaimTemplate:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
$ helm get values elasticsearch-master-7-10-1  -n elasticsearch
USER-SUPPLIED VALUES:
antiAffinity: hard
antiAffinityTopologyKey: kubernetes.io/zone
clusterName: elasticsearch
esConfig:
  log4j2.properties: |
    logger.deprecation.level = warn
    logger.deprecation.name = org.elasticsearch.deprecation
esJavaOpts: -XX:+UseContainerSupport -XX:MaxRAMPercentage=50.0 -XX:InitialRAMPercentage=50.0
image: gcr.io/company/elasticsearch/elasticsearch
imagePullPolicy: IfNotPresent
imageTag: 7.10.1
keystore:
- secretName: gcs-credentials
labels:
  service: elastic-master-7-10-1
masterService: elasticsearch-master-7-10-1
maxUnavailable: 1
minimumMasterNodes: 2
nodeAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    nodeSelectorTerms:
    - matchExpressions:
      - key: instancegroup
        operator: In
        values:
        - elasticsearch
nodeGroup: master-7-10-1
nodeSelector:
  instancegroup: elasticsearch
persistence:
  enabled: true
replicas: 1
resources:
  limits:
    cpu: 500m
    memory: 4Gi
  requests:
    cpu: 200m
    memory: 4Gi
roles:
  data: "false"
  ingest: "false"
  master: "true"
tolerations:
- effect: NoSchedule
  key: dedicated
  value: elasticsearch
volumeClaimTemplate:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
      

Describe the bug:

Steps to reproduce:

  1. Apply an HR for the master nodeGroup using 6.8.9 with 3 nodes
  2. Reduce the masters to 2 nodes
  3. Apply another HR for a second master group using 7.10.1 with the same nodeGroup and service
  4. The release fails to install because of a duplicated PodDisruptionBudget (PDB) and ConfigMap (CM); see the sketch after this list.
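
The collision is inherent in the chart's naming: resource names are derived from clusterName plus nodeGroup, so two releases sharing nodeGroup: master render identically named objects. A minimal sketch of the conflicting manifests, assuming the chart's usual <clusterName>-<nodeGroup> convention (verify against your own output of helm template):

apiVersion: policy/v1beta1             # PDB API version available on Kubernetes 1.17
kind: PodDisruptionBudget
metadata:
  name: elasticsearch-master-pdb       # rendered identically by both releases
spec:
  maxUnavailable: 1
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: elasticsearch-master-config    # also rendered identically by both releases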

Alternatively:

  1. Apply an HR for the master nodeGroup using 6.8.9 with 3 nodes
  2. Reduce the masters to 2 nodes
  3. Apply another HR for a second master group using 7.10.1 with a slightly modified nodeGroup but the same cluster
  4. The new master does not join the previously existing nodes (see the discovery sketch after this list).
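
Here nothing collides, but discovery fails instead. The chart derives its seed hosts from masterService, so the sketch below shows the effective settings this would give the 7.10.1 release (assumed from the chart's convention of targeting <masterService>-headless, not copied from a live pod):

discovery.seed_hosts: elasticsearch-master-7-10-1-headless    # resolves only to the new pod
cluster.initial_master_nodes: elasticsearch-master-7-10-1-0   # so the new node elects itself

With both values pointing at the new nodeGroup, the 7.10.1 node never learns about the existing 6.8.9 masters and bootstraps a one-node cluster of its own, which matches the _cat/nodes output further down.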

Expected behavior:

It should be possible to upgrade the masters one at a time by creating a second Helm release.

Any additional context:

I use helm-operator, and I’m trying to upgrade the Elasticsearch version from 6.8.9 to 7.10.1.

I had minimal issues upgrading the data, client, and ingest nodes, but this method runs into trouble for the masters. I am able to scale the masters down to 2:

$ curl -s  "https://elasticsearch.example.com/_cat/nodes?v"| grep master
ip             heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.193.100.132           34          33   2    0.54    0.36     0.33 m         *      elasticsearch-master-0
10.193.96.133            58          33   2    1.99    1.41     1.31 m         -      elasticsearch-master-1

But the new node doesn’t join the cluster:

$ kubectl exec -it elasticsearch-master-7-10-1-0 -- curl localhost:9200/_cat/nodes
10.193.107.5 11 31 15 0.93 1.03 1.13 lmr * elasticsearch-master-7-10-1-0

I think this is primarily due to specifying a different nodeGroup:

    nodeGroup: "master-7-10-1"
    # "$clusterName-$masterNodeGroup" as "nodeGroup != 'master'"
    masterService: "elasticsearch-master-7-10-1"
    labels:
      service: elastic-master-7-10-1

But when I try to use the same nodeGroup as the original Helm release, I run into a few errors.

  1. The PodDisruptionBudget is duplicated. I can get around this by setting maxUnavailable: 0.
  2. The elasticsearch-master-config ConfigMap is duplicated. I can’t get around this (see the values sketch after this list).
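
Expressed as values for the second release, the situation looks roughly like this (assumptions: the chart skips the PodDisruptionBudget template when maxUnavailable is falsy, and renders the ConfigMap whenever esConfig is non-empty):

nodeGroup: master        # reuse the original group name
maxUnavailable: 0        # assumed: a falsy value makes the chart skip the PDB template
# There is no equivalent switch for the ConfigMap: it is rendered whenever
# esConfig is non-empty, and dropping esConfig would also drop the log4j2
# deprecation-logger overrides shown above.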

When upgrading the other node types, I was able to create an additional Helm release for each without issue, but it seems this is not possible for the masters. Are there any better strategies than doing it this way?

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 5 (1 by maintainers)

Top GitHub Comments

1 reaction
ebuildy commented, Jan 26, 2021

Why don’t you upgrade the current Helm release?

Also, it is possible to attach another Helm release to the existing cluster by setting:

masterService --> your current master service
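
Applied to the values above, that suggestion would look something like this sketch, which keeps the new nodeGroup unique (so the PDB and ConfigMap names don’t clash) while discovery targets the original service:

nodeGroup: master-7-10-1              # stays unique, no resource-name conflicts
masterService: elasticsearch-master   # points discovery at the original release’s service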
