
Adding a new master HelmRelease (HR) to perform a rolling upgrade conflicts

See original GitHub issue

Chart version: 6.8.9 -> 7.10.1

Kubernetes version: v1.17.13-gke.2600

Kubernetes provider: GKE

Helm Version: 3

helm get release output

e.g. helm get elasticsearch (replace elasticsearch with the name of your helm release)

Be careful to obfuscate any secrets (credentials, tokens, public IPs, …) that could be visible in the output before copy-pasting.

If you find any secrets in plain text in the helm get release output, you should use Kubernetes Secrets to manage them in a secure way (see Security Example).
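
The gcs-credentials entry under keystore in the values below is an example of that pattern. A minimal sketch of such a Secret, assuming the chart loads each key of the referenced Secret into the Elasticsearch keystore (the gcs.client.default.credentials_file key is a hypothetical entry for the GCS repository plugin, not copied from this cluster):

apiVersion: v1
kind: Secret
metadata:
  name: gcs-credentials
  namespace: elasticsearch
type: Opaque
stringData:
  # each key in the Secret becomes one Elasticsearch keystore entry
  gcs.client.default.credentials_file: "<placeholder service-account JSON>"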

Output of helm get release
$ helm get values elasticsearch-master  -n elasticsearch
USER-SUPPLIED VALUES:
antiAffinity: hard
antiAffinityTopologyKey: kubernetes.io/zone
clusterName: elasticsearch
esConfig:
  log4j2.properties: |
    logger.deprecation.level = warn
    logger.deprecation.name = org.elasticsearch.deprecation
esJavaOpts: -XX:+UseContainerSupport -XX:MaxRAMPercentage=50.0 -XX:InitialRAMPercentage=50.0
image: gcr.io/company/elasticsearch/elasticsearch
imagePullPolicy: IfNotPresent
imageTag: 6.8.9
keystore:
- secretName: gcs-credentials
labels:
  service: elastic-master
masterService: elasticsearch-master
maxUnavailable: 1
minimumMasterNodes: 2
nodeAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    nodeSelectorTerms:
    - matchExpressions:
      - key: instancegroup
        operator: In
        values:
        - elasticsearch
nodeGroup: master
nodeSelector:
  instancegroup: elasticsearch
persistence:
  enabled: true
replicas: 3
resources:
  limits:
    cpu: 500m
    memory: 4Gi
  requests:
    cpu: 200m
    memory: 4Gi
roles:
  data: "false"
  ingest: "false"
  master: "true"
tolerations:
- effect: NoSchedule
  key: dedicated
  value: elasticsearch
volumeClaimTemplate:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
$ helm get values elasticsearch-master-7-10-1  -n elasticsearch
USER-SUPPLIED VALUES:
antiAffinity: hard
antiAffinityTopologyKey: kubernetes.io/zone
clusterName: elasticsearch
esConfig:
  log4j2.properties: |
    logger.deprecation.level = warn
    logger.deprecation.name = org.elasticsearch.deprecation
esJavaOpts: -XX:+UseContainerSupport -XX:MaxRAMPercentage=50.0 -XX:InitialRAMPercentage=50.0
image: gcr.io/company/elasticsearch/elasticsearch
imagePullPolicy: IfNotPresent
imageTag: 7.10.1
keystore:
- secretName: gcs-credentials
labels:
  service: elastic-master-7-10-1
masterService: elasticsearch-master-7-10-1
maxUnavailable: 1
minimumMasterNodes: 2
nodeAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    nodeSelectorTerms:
    - matchExpressions:
      - key: instancegroup
        operator: In
        values:
        - elasticsearch
nodeGroup: master-7-10-1
nodeSelector:
  instancegroup: elasticsearch
persistence:
  enabled: true
replicas: 1
resources:
  limits:
    cpu: 500m
    memory: 4Gi
  requests:
    cpu: 200m
    memory: 4Gi
roles:
  data: "false"
  ingest: "false"
  master: "true"
tolerations:
- effect: NoSchedule
  key: dedicated
  value: elasticsearch
volumeClaimTemplate:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
      

Describe the bug:

Steps to reproduce:

  1. Apply an HR for the master nodeGroup using 6.8.9 with 3 nodes
  2. Reduce the masters to 2 nodes
  3. Apply another HR for a second master group using 7.10.1 with the same nodeGroup and service
  4. The release fails to install because of a duplicated PodDisruptionBudget (PDB) and ConfigMap (CM); see the sketch after this list.
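
The collision is inherent in the chart's naming: resource names are derived from clusterName plus nodeGroup, so two releases sharing nodeGroup: master render identically named objects. A minimal sketch of the conflicting manifests, assuming the chart's usual <clusterName>-<nodeGroup> convention (verify against your own output of helm template):

apiVersion: policy/v1beta1             # PDB API version available on Kubernetes 1.17
kind: PodDisruptionBudget
metadata:
  name: elasticsearch-master-pdb       # rendered identically by both releases
spec:
  maxUnavailable: 1
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: elasticsearch-master-config    # also rendered identically by both releases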

Alternatively:

  1. Apply an HR for the master nodeGroup using 6.8.9 with 3 nodes
  2. Reduce the masters to 2 nodes
  3. Apply another HR for a second master group using 7.10.1 with a slightly modified nodeGroup but the same cluster
  4. The new master does not join the previously existing nodes (see the discovery sketch after this list).
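
Here nothing collides, but discovery fails instead. The chart derives its seed hosts from masterService, so the sketch below shows the effective settings this would give the 7.10.1 release (assumed from the chart's convention of targeting <masterService>-headless, not copied from a live pod):

discovery.seed_hosts: elasticsearch-master-7-10-1-headless    # resolves only to the new pod
cluster.initial_master_nodes: elasticsearch-master-7-10-1-0   # so the new node elects itself

With both values pointing at the new nodeGroup, the 7.10.1 node never learns about the existing 6.8.9 masters and bootstraps a one-node cluster of its own, which matches the _cat/nodes output further down.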

Expected behavior:

It should be possible to upgrade the masters one at a time by creating a second Helm release.

Any additional context:

I use helm-operator, and I’m trying to upgrade the Elasticsearch version from 6.8.9 to 7.10.1.

I had minimal issues upgrading the data, client, and ingest nodes, but this method runs into trouble for the masters. I am able to scale the masters down to 2:

$ curl -s  "https://elasticsearch.example.com/_cat/nodes?v"| grep master
ip             heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.193.100.132           34          33   2    0.54    0.36     0.33 m         *      elasticsearch-master-0
10.193.96.133            58          33   2    1.99    1.41     1.31 m         -      elasticsearch-master-1

But the new node doesn’t join the cluster:

$ kubectl exec -it elasticsearch-master-7-10-1-0 -- curl localhost:9200/_cat/nodes
10.193.107.5 11 31 15 0.93 1.03 1.13 lmr * elasticsearch-master-7-10-1-0

I think this is primarily due to specifying a different nodeGroup:

    nodeGroup: "master-7-10-1"
    # "$clusterName-$masterNodeGroup" as "nodeGroup != 'master'"
    masterService: "elasticsearch-master-7-10-1"
    labels:
      service: elastic-master-7-10-1

But when I try to use the same nodeGroup as the original Helm release, I run into a few errors.

  1. The PodDisruptionBudget is duplicated. I can get around this by setting maxUnavailable: 0.
  2. The elasticsearch-master-config ConfigMap is duplicated. I can’t get around this (see the values sketch after this list).
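
Expressed as values for the second release, the situation looks roughly like this (assumptions: the chart skips the PodDisruptionBudget template when maxUnavailable is falsy, and renders the ConfigMap whenever esConfig is non-empty):

nodeGroup: master        # reuse the original group name
maxUnavailable: 0        # assumed: a falsy value makes the chart skip the PDB template
# There is no equivalent switch for the ConfigMap: it is rendered whenever
# esConfig is non-empty, and dropping esConfig would also drop the log4j2
# deprecation-logger overrides shown above.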

When upgrading the other node types, I was able to create an additional Helm release for each without issue, but it seems this is not possible for the masters. Are there any better strategies than doing it this way?

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 5 (1 by maintainers)

Top GitHub Comments

1 reaction
ebuildy commented, Jan 26, 2021

Why don’t you upgrade the current Helm release?

Also, it is possible to attach another Helm release to the existing cluster by setting:

masterService --> your current master service
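
Applied to the values above, that suggestion would look something like this sketch, which keeps the new nodeGroup unique (so the PDB and ConfigMap names don’t clash) while discovery targets the original service:

nodeGroup: master-7-10-1              # stays unique, no resource-name conflicts
masterService: elasticsearch-master   # points discovery at the original release’s service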
