Worker Container OOMKilled in Kubernetes Deployment
Alluxio Version: 2.6.1
Describe the bug
We use Helm to deploy an Alluxio cluster on Kubernetes following the doc. We configured a single master and 8 workers; for memory, we set 32G for the master and 64G for each worker. We are seeing the workers’ memory usage keep growing until it exceeds the k8s container limit and triggers an OOM kill.
➜ ~ kubectl get pod
NAME                   READY   STATUS             RESTARTS   AGE
alluxio-master-0       2/2     Running            1          110m
alluxio-worker-59ggl   0/2     CrashLoopBackOff   29         122m
alluxio-worker-64fj8   0/2     CrashLoopBackOff   28         122m
alluxio-worker-b42bd   1/2     CrashLoopBackOff   28         122m
alluxio-worker-cbsrj   0/2     CrashLoopBackOff   23         122m
alluxio-worker-gfpbp   1/2     Running            29         122m
alluxio-worker-ghdm7   0/2     CrashLoopBackOff   29         122m
alluxio-worker-mf2gl   0/2     CrashLoopBackOff   29         122m
alluxio-worker-mxzgs   0/2     CrashLoopBackOff   27         122m
To Reproduce
Follow the doc to deploy on Kubernetes, using the config to customize some settings. After mounting S3 buckets and running for ~30 mins, we start to see worker container restarts caused by OOMKilled in k8s. Sometimes it gets worse: the pods fail to re-create containers, also due to OOMKilled.
This is the pod memory usage (w/o cache): the first worker container restarted after ~40 mins.
Then more worker containers restarted over the next few hours.
We saw a lot of cache memory in the pods’ containers, which was not expected.
Expected behavior
We expect the workers to run stably and within the memory usage limit.
Urgency
This is urgent; we are using Alluxio in production to speed up I/O and optimize Spark data loading.
Additional context
We suspect the ramdisk usage is somehow counted toward the k8s memory usage and triggers the OOM. We tried setting a 120 GB k8s pod limit and a 100 GB ramdisk size. Now we clearly see the memory usage (w/o cache) hit 20 GB (= 120 GB − 100 GB) and then the container restarts.
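To make the arithmetic concrete, here is a minimal shell sketch of the headroom calculation (the 120 GB limit and 100 GB ramdisk are the values from above; the variable names are illustrative):

```shell
#!/bin/sh
# The pod memory limit must cover BOTH the JVM process and the memory-backed
# ramdisk, because tmpfs pages are charged to the container's cgroup.
POD_LIMIT_GB=120     # k8s container memory limit
RAMDISK_GB=100       # alluxio.worker.ramdisk.size

# Headroom left for the worker JVM (heap + direct memory + metaspace)
# once the ramdisk fills up:
HEADROOM_GB=$((POD_LIMIT_GB - RAMDISK_GB))
echo "JVM headroom once ramdisk is full: ${HEADROOM_GB} GB"   # prints 20 GB
```

Once the ramdisk is full, any JVM footprint beyond that headroom pushes the container over its limit, which matches the restarts at 20 GB of non-cache usage.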
➜ ~ kubectl describe pod alluxio-worker-8mkpc
Containers:
  alluxio-worker:
    State:          Running
      Started:      Tue, 24 Aug 2021 11:40:31 -0700
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Tue, 24 Aug 2021 09:56:35 -0700
      Finished:     Tue, 24 Aug 2021 11:40:27 -0700
    Ready:          True
    Restart Count:  3
    Limits:
      cpu:     4
      memory:  120G
➜ ~ kubectl exec -it alluxio-worker-8mkpc -c alluxio-worker -- bash
alluxio@alluxio-worker-8mkpc:/opt/alluxio-2.6.1$ alluxio getConf alluxio.worker.ramdisk.size
100G
alluxio@alluxio-worker-8mkpc:/opt/alluxio-2.6.1$ df -h
Filesystem      Size  Used  Avail  Use%  Mounted on
ramdisk         100G   99G   1.1G   99%  /mnt/alluxio-k8s/ramdisk
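One way to confirm that the tmpfs pages are charged to the worker's cgroup is to compare the cgroup's reported usage against the df output above. The kubectl invocation below is a sketch assuming cgroup v1 paths, and the `bytes_to_gib` helper is ours, not part of Alluxio:

```shell
#!/bin/sh
# Convert a raw cgroup byte count into whole GiB, for easy comparison
# with the df output and the pod memory limit.
bytes_to_gib() {
    echo $(( $1 / 1024 / 1024 / 1024 ))
}

# Inside the worker container (cgroup v1 paths; on cgroup v2 read
# /sys/fs/cgroup/memory.max and memory.current instead):
#
#   kubectl exec -it alluxio-worker-8mkpc -c alluxio-worker -- \
#       cat /sys/fs/cgroup/memory/memory.limit_in_bytes \
#           /sys/fs/cgroup/memory/memory.usage_in_bytes
#
# memory.usage_in_bytes includes the tmpfs (ramdisk) pages, which is why
# the cgroup usage tracks "df Used" plus the JVM footprint.
bytes_to_gib 128849018880   # e.g. a 120 GiB limit read from the cgroup file; prints 120
```

When the ramdisk is 99% full as above, the cgroup usage will already be close to the limit regardless of how small the JVM heap is.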
Issue Analytics
- Created 2 years ago
- Comments: 11 (9 by maintainers)
Top GitHub Comments
@ffcai Thanks for reporting back your findings. Glad it works now.
@ZhuTopher do you mind adding some notes in our doc mentioning that pod memory limit needs to include ramdisk allocation as well? We can close the issue once doc is updated.
Hi @ZhuTopher I just read through https://github.com/Alluxio/alluxio/issues/13022; the issue we ran into is very close to it, and it took us quite a few days to realize that the pod memory limit needs to account for the memory-backed volume, matching your explanation and conclusion.
Now we have increased the worker’s pod memory limit to 120 GB, which includes the 50 GB ramdisk (memory-backed volume); the worker’s JVM heap size is under 20 GB, and the Alluxio cluster is running stably. We are also trying a 500 GB pod memory limit with a 400 GB ramdisk size when the k8s node is not running heavy jobs, and will continue to monitor the cluster.
I think we need to add this to the doc Deploy Alluxio on Kubernetes. I believe more and more Alluxio deployments will be cloud native and on k8s. Although it’s more of a k8s issue, given the current lack of support for memory-backed volumes, the Alluxio community can make this clearer for users. There’s no reason to blame Alluxio for this, as I said when I discussed the issue with @yuzhu earlier last week, but adding a note to the installation doc would be very helpful.
We actually tried both emptyDir and hostPath; I’ll post some test results with more details later. Overall, it’s the same memory-limit behavior discussed in https://github.com/Alluxio/alluxio/issues/13022. Thanks everyone for the help!
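For reference, a memory-backed emptyDir ramdisk looks roughly like the pod spec fragment below. Per the Kubernetes docs, files written to a `medium: Memory` volume count against the container's memory limit, so the limit has to be sized as JVM footprint plus ramdisk. The sizes and volume/mount names here are illustrative, not taken from the Alluxio Helm chart:

```yaml
# Pod spec fragment (illustrative): a memory-backed volume for the ramdisk.
# Files written here are tmpfs pages charged against the container's memory
# limit, so limits.memory must cover JVM + ramdisk.
volumes:
  - name: alluxio-ramdisk
    emptyDir:
      medium: Memory
      sizeLimit: 50Gi          # keep this well below limits.memory
containers:
  - name: alluxio-worker
    volumeMounts:
      - name: alluxio-ramdisk
        mountPath: /dev/shm    # or whatever path backs the MEM tier
    resources:
      limits:
        memory: 120Gi          # >= JVM footprint + ramdisk sizeLimit
```

With hostPath to a host tmpfs the sizing concern is similar, since tmpfs pages are still attributed to the cgroup of the process that writes them.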