elasticsearch AWS data transfer spike

Chart version: 7.9.3

Kubernetes version:
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.4", GitCommit:"e87da0bd6e03ec3fea7933c4b5263d151aafd07c", GitTreeState:"clean", BuildDate:"2021-02-21T20:23:45Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.9", GitCommit:"94f372e501c973a7fa9eb40ec9ebd2fe7ca69848", GitTreeState:"clean", BuildDate:"2020-09-16T13:47:43Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

Kubernetes provider: AWS (kOps)

Helm Version: version.BuildInfo{Version:"v3.5.2", GitCommit:"167aac70832d3a384f65f9745335e9fb40169dc2", GitTreeState:"dirty", GoVersion:"go1.15.7"}

helm get release output

e.g. helm get elasticsearch (replace elasticsearch with the name of your helm release)

Unable to run this on the cloud because of the data transfer spike; as soon as I run it, I start incurring hundreds of dollars in data transfer charges.

Be careful to obfuscate any secrets (credentials, tokens, public IPs, …) that could be visible in the output before copy-pasting.

If you find secrets in plain text in the helm get release output, you should use Kubernetes Secrets to manage them in a secure way (see the Security Example).

Output of helm get release
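
The output itself was not included. For reference, a minimal sketch of how it can be collected with Helm 3, assuming the release is named elasticsearch as in the values below (obfuscate secrets before sharing):

helm get values elasticsearch   # only the user-supplied values
helm get all elasticsearch      # rendered manifests, hooks, values and notes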

Describe the bug:

Currently I deploy the configuration using the Helm chart stated above and leave it running. It runs perfectly fine and my apps are able to connect to the service. Then I get a massive data transfer bill from AWS, in the hundreds of dollars. After researching and investigating the issue on AWS, we have isolated it to traffic streaming from the Elasticsearch service. I run all the other apps and I don’t get a crazy bill. Below you will find one month’s data transfer bill from AWS.

Bandwidth: $1,294.65
  $0.000 per GB - data transfer in per month:                                                     1.517 GB        $0.00
  $0.000 per GB - first 1 GB of data transferred out per month:                                   0.567 GB        $0.00
  $0.010 per GB - regional data transfer - in/out/between EC2 AZs or using elastic IPs or ELB:  129,464.982 GB  $1,294.65
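
The only significant line item is regional (cross-AZ) transfer rather than internet egress, which usually points at node-to-node traffic such as shard replication between Elasticsearch pods scheduled in different availability zones. A minimal diagnostic sketch, assuming kubectl access to the cluster and the service name elasticsearch produced by fullnameOverride in the values below; names and labels are placeholders to adjust:

# Which availability zone is each node (and therefore each Elasticsearch pod) in?
kubectl get nodes -L topology.kubernetes.io/zone -L failure-domain.beta.kubernetes.io/zone
kubectl get pods -o wide | grep elasticsearch

# Transport-layer (node-to-node) traffic counters per Elasticsearch node
kubectl port-forward svc/elasticsearch 9200:9200 &
curl -s 'http://localhost:9200/_nodes/stats/transport?pretty'   # rx_size_in_bytes / tx_size_in_bytes

# Shard and replica layout: every replica write crosses whatever AZ boundary separates the nodes
curl -s 'http://localhost:9200/_cat/shards?v'
curl -s 'http://localhost:9200/_cat/indices?v&h=index,pri,rep,store.size'

If the transport rx/tx counters grow at anything like the ~129 TB per month billed above, the spend is coming from replication and relocation between zones rather than from client traffic.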

Steps to reproduce:

  1. Deploy Kubernetes on AWS using kOps.
  2. Deploy Elasticsearch with this chart (see the sketch after this list).
  3. Receive a huge data transfer bill at the end of the month.
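
The deployment step, spelled out as a hedged sketch (assuming the official Elastic chart repository and the chart version reported at the top; the release name elasticsearch matches fullnameOverride in the values file below):

# Add the Elastic chart repository and install chart version 7.9.3 with the values file shown below
helm repo add elastic https://helm.elastic.co
helm repo update
helm install elasticsearch elastic/elasticsearch --version 7.9.3 -f values.yaml

# Wait for the four pods to become ready
kubectl get pods -w | grep elasticsearch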

Expected behavior: deploy the service using the Helm chart and not incur large data transfer bills at the end of each month. Maybe something is wrong with my configuration.

Provide logs and/or server output (if relevant):

Be careful to obfuscate any secrets (credentials, tokens, public IPs, …) that could be visible in the output before copy-pasting.

Any additional context:

My current configuration:

---
clusterName: "elasticsearch"
nodeGroup: "master"

# The service that non master groups will try to connect to when joining the cluster
# This should be set to clusterName + "-" + nodeGroup for your master group
masterService: ""

# Elasticsearch roles that will be applied to this nodeGroup
# These will be set as environment variables. E.g. node.master=true
roles:
  master: "true"
  ingest: "true"
  data: "true"
#  remote_cluster_client: "true" # For latest Versions of es
# ml: "true" # ml is not available with elasticsearch-oss

replicas: 4
minimumMasterNodes: 1

esMajorVersion: ""

# Allows you to add any config files in /usr/share/elasticsearch/config/
# such as elasticsearch.yml and log4j2.properties
esConfig: {}
#  elasticsearch.yml: |
#    key:
#      nestedkey: value
#  log4j2.properties: |
#    key = value

# Extra environment variables to append to this nodeGroup
# This will be appended to the current 'env:' key. You can use any of the kubernetes env
# syntax here
extraEnvs: []
#  - name: MY_ENVIRONMENT_VAR
#    value: the_value_goes_here

# Allows you to load environment variables from kubernetes secret or config map
envFrom: []
# - secretRef:
#     name: env-secret
# - configMapRef:
#     name: config-map

# A list of secrets and their paths to mount inside the pod
# This is useful for mounting certificates for security and for mounting
# the X-Pack license
secretMounts: []
#  - name: elastic-certificates
#    secretName: elastic-certificates
#    path: /usr/share/elasticsearch/config/certs
#    defaultMode: 0755

hostAliases: []
#- ip: "127.0.0.1"
#  hostnames:
#  - "foo.local"
#  - "bar.local"

image: "docker.elastic.co/elasticsearch/elasticsearch-oss"
imageTag: "6.8.8"
imagePullPolicy: "IfNotPresent"

podAnnotations: {}
# iam.amazonaws.com/role: es-cluster

# additional labels
labels: {}

#esJavaOpts: "-Xmx1g -Xms1g" # For Production
esJavaOpts: "-Xmx128m -Xms128m" # For Minikube

resources:
  requests:
    cpu: "1000m"
    memory: "2Gi"
  limits:
    cpu: "1000m"
    memory: "2Gi"

initResources: {}
  # limits:
  #   cpu: "25m"
  #   # memory: "128Mi"
  # requests:
  #   cpu: "25m"
  #   memory: "128Mi"

sidecarResources: {}
  # limits:
  #   cpu: "25m"
  #   # memory: "128Mi"
  # requests:
  #   cpu: "25m"
  #   memory: "128Mi"

networkHost: "0.0.0.0"

volumeClaimTemplate:
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 30Gi

rbac:
  create: true
  serviceAccountAnnotations: {}
  serviceAccountName: ""

podSecurityPolicy:
  create: false
  name: ""
  spec:
    privileged: true
    fsGroup:
      rule: RunAsAny
    runAsUser:
      rule: RunAsAny
    seLinux:
      rule: RunAsAny
    supplementalGroups:
      rule: RunAsAny
    volumes:
      - secret
      - configMap
      - persistentVolumeClaim
      - emptyDir

persistence:
  enabled: true
  labels:
    # Add default labels for the volumeClaimTemplate of the StatefulSet
    enabled: false
  annotations: {}

extraVolumes: []
  # - name: extras
  #   emptyDir: {}

extraVolumeMounts: []
  # - name: extras
  #   mountPath: /usr/share/extras
  #   readOnly: true

extraContainers: []
  # - name: do-something
  #   image: busybox
  #   command: ['do', 'something']

extraInitContainers: []
  # - name: do-something
  #   image: busybox
  #   command: ['do', 'something']

# This is the PriorityClass settings as defined in
# https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/#priorityclass
priorityClassName: ""

# By default this will make sure two pods don't end up on the same node
# Changing this to a region would allow you to spread pods across regions
antiAffinityTopologyKey: "kubernetes.io/hostname"

# Hard means that by default pods will only be scheduled if there are enough nodes for them
# and that they will never end up on the same node. Setting this to soft will do this "best effort"
antiAffinity: "hard"
#antiAffinity: "soft" # For Minikube

# This is the node affinity settings as defined in
# https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#node-affinity-beta-feature
nodeAffinity: {}

# The default is to deploy all pods serially. By setting this to parallel all pods are started at
# the same time when bootstrapping the cluster
podManagementPolicy: "Parallel"

# The environment variables injected by service links are not used, but can lead to slow Elasticsearch boot times when
# there are many services in the current namespace.
# If you experience slow pod startups you probably want to set this to `false`.
enableServiceLinks: true

protocol: http
httpPort: 9200
transportPort: 9300

service:
  labels: {}
  labelsHeadless: {}
  type: ClusterIP
  nodePort: ""
  annotations: {}
  httpPortName: http
  transportPortName: transport
  loadBalancerIP: ""
  loadBalancerSourceRanges: []
  externalTrafficPolicy: ""

updateStrategy: RollingUpdate

# This is the max unavailable setting for the pod disruption budget
# The default value of 1 will make sure that kubernetes won't allow more than 1
# of your pods to be unavailable during maintenance
maxUnavailable: 1

podSecurityContext:
  fsGroup: 1000
  runAsUser: 1000

securityContext:
  capabilities:
    drop:
      - ALL
  # readOnlyRootFilesystem: true
  runAsNonRoot: true
  runAsUser: 1000

# How long to wait for elasticsearch to stop gracefully
terminationGracePeriod: 120

sysctlVmMaxMapCount: 262144

readinessProbe:
  failureThreshold: 3
  initialDelaySeconds: 10
  periodSeconds: 10
  successThreshold: 3
  timeoutSeconds: 5

# https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-health.html#request-params wait_for_status
clusterHealthCheckParams: "wait_for_status=green&timeout=1s"

## Use an alternate scheduler.
## ref: https://kubernetes.io/docs/tasks/administer-cluster/configure-multiple-schedulers/
##
schedulerName: ""

imagePullSecrets: []
nodeSelector: {}
tolerations: []

# Enabling this will publicly expose your Elasticsearch instance.
# Only enable this if you have security enabled on your cluster
ingress:
  enabled: false
  annotations: {}
    # kubernetes.io/ingress.class: nginx
    # kubernetes.io/tls-acme: "true"
  path: /
  hosts:
    - chart-example.local
  tls: []
  #  - secretName: chart-example-tls
  #    hosts:
  #      - chart-example.local

nameOverride: "elasticsearch"
fullnameOverride: "elasticsearch"

# https://github.com/elastic/helm-charts/issues/63
masterTerminationFix: false

lifecycle: {}
  # preStop:
  #   exec:
  #     command: ["/bin/sh", "-c", "echo Hello from the postStart handler > /usr/share/message"]
  # postStart:
  #   exec:
  #     command:
  #       - bash
  #       - -c
  #       - |
  #         #!/bin/bash
  #         # Add a template to adjust number of shards/replicas
  #         TEMPLATE_NAME=my_template
  #         INDEX_PATTERN="logstash-*"
  #         SHARD_COUNT=8
  #         REPLICA_COUNT=1
  #         ES_URL=http://localhost:9200
  #         while [[ "$(curl -s -o /dev/null -w '%{http_code}\n' $ES_URL)" != "200" ]]; do sleep 1; done
  #         curl -XPUT "$ES_URL/_template/$TEMPLATE_NAME" -H 'Content-Type: application/json' -d'{"index_patterns":['\""$INDEX_PATTERN"\"'],"settings":{"number_of_shards":'$SHARD_COUNT',"number_of_replicas":'$REPLICA_COUNT'}}'

sysctlInitContainer:
  enabled: true

keystore: []

# Deprecated
# please use the above podSecurityContext.fsGroup instead
fsGroup: "" 
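
One thing worth noting about the values above: antiAffinityTopologyKey: "kubernetes.io/hostname" only spreads the four replicas across hosts, and nodeAffinity is empty, so nothing stops kOps from scheduling them in different availability zones; in that layout every shard replication and relocation is billed as regional transfer, which would be consistent with the bill item above. A hedged sketch of an override that pins this node group to a single zone through the chart’s nodeAffinity value (the file name az-pin-values.yaml and the zone us-east-1a are placeholders; older nodes may only carry the failure-domain.beta.kubernetes.io/zone label):

# Hypothetical override file layered on top of the values above
cat <<'EOF' > az-pin-values.yaml
nodeAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    nodeSelectorTerms:
      - matchExpressions:
          - key: topology.kubernetes.io/zone   # or failure-domain.beta.kubernetes.io/zone on older kOps nodes
            operator: In
            values:
              - us-east-1a                     # placeholder: the zone this node group should stay in
EOF

helm upgrade elasticsearch elastic/elasticsearch --version 7.9.3 -f values.yaml -f az-pin-values.yaml

Pinning to one zone trades zone failure tolerance for free replication traffic; the alternative is to keep the spread and accept the cross-AZ cost, or to reduce replica counts. Note that with antiAffinity: "hard" the chosen zone needs at least four schedulable nodes for the four pods, otherwise antiAffinity would have to be switched to "soft".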

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 10 (5 by maintainers)

Top GitHub Comments

1 reaction
mark-vieira commented, Jun 21, 2021

I agree with Rory’s comment above; it doesn’t seem clear that this issue is directly related to the Helm chart.

1 reaction
pugnascotia commented, Mar 22, 2021

We should move the discussion to https://discuss.elastic.co/; it’s a more appropriate place for digging into this issue.


Top Results From Across the Web

  • Taming data transfer costs with Elasticsearch - The Guardian
    After a perfect roll-out, we noticed a substantial spike in our data transfer costs, and it was interesting to figure out its origin...

  • Spike in AWS Error Messages | Elastic Security Solution [8.5]
    Triage and analysis: Investigating Spike in AWS Error Messages. CloudTrail logging provides visibility on actions taken within an AWS environment.

  • Improve the indexing performance in Amazon OpenSearch ...
    I want to optimize indexing operations in Amazon OpenSearch Service for maximum ingestion throughput. How can I do this? Resolution: Be sure ...

  • AWS Data Transfer Costs: Solving Hidden Network Transfer ...
    Hidden AWS data transfer costs can lead to higher than expected cloud service bills. This post will help track hidden fees and show...

  • Terrible Ideas for Avoiding AWS Data Transfer Costs
    Data transfer between availability zones within a region costs a pile of money (usually 2¢ per GB in the “main” regions, but costs...
