Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging third-party libraries. It collects links to all the places you might be looking while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Possible memory leak in NodeJS / Python services

See original GitHub issue

Uptime checks for the production deployment of OnlineBoutique have been failing once every few weeks. Looking at the kubectl events that coincide with an uptime check failure:

38m         Warning   NodeSysctlChange   node/gke-online-boutique-mast-default-pool-65a22575-azeq   {"unmanaged": {"net.ipv4.tcp_fastopen_key": "004baa97-3c3b554d-9bbcccf8-870ced36"}}
43m         Warning   NodeSysctlChange   node/gke-online-boutique-mast-default-pool-65a22575-i6m8   {"unmanaged": {"net.ipv4.tcp_fastopen_key": "706b7d5f-9df4b412-e8eb875e-179c4765"}}
46m         Warning   NodeSysctlChange   node/gke-online-boutique-mast-default-pool-65a22575-jvwz   {"unmanaged": {"net.ipv4.tcp_fastopen_key": "a0f734c5-5c9a56e1-06aeb420-0010498e"}}
39m         Warning   OOMKilling         node/gke-online-boutique-mast-default-pool-65a22575-jvwz   Memory cgroup out of memory: Kill process 569290 (node) score 2181 or sacrifice child
Killed process 569290 (node) total-vm:1418236kB, anon-rss:121284kB, file-rss:33236kB, shmem-rss:0kB
39m         Warning   OOMKilling         node/gke-online-boutique-mast-default-pool-65a22575-jvwz   Memory cgroup out of memory: Kill process 2592522 (grpc_health_pro) score 1029 or sacrifice child
Killed process 2592530 (grpc_health_pro) total-vm:710956kB, anon-rss:1348kB, file-rss:7376kB, shmem-rss:0kB

It looks like container memory usage is exceeding the configured limits, even though there seems to be plenty of allocatable memory across the prod GKE nodes.

[Screenshot: allocatable memory across the prod GKE nodes]

But as observed by @bourgeoisor, it seems that three of the workloads are using steadily increasing amounts of memory until the pods are killed by GKE.

Currency and payment (NodeJS):

[Screenshot: memory usage over time for the currency and payment services]

Recommendation (Python):

[Screenshot: memory usage over time for the recommendation service]

TODO - investigate possible memory leaks, starting with the NodeJS services. Investigate why the services use an increasing amount of memory over time rather than a constant amount. Then investigate the Python services and check whether other Python services (emailservice, for instance) show the same behavior as the recommendation service.
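
As a first, low-effort check (not something described in the original issue), memory growth can be made visible directly in the service logs by sampling process.memoryUsage() on a timer: a steady climb in heapUsed or rss over hours supports the leak hypothesis, while a sawtooth that settles points at normal garbage-collection behaviour. A minimal Node.js sketch using only the standard library, with an arbitrary sampling interval:

```js
// Minimal memory-growth logger (standard library only).
// The 60-second interval is an illustrative choice, not a recommendation.
const SAMPLE_INTERVAL_MS = 60_000;

const toMiB = (bytes) => (bytes / 1024 / 1024).toFixed(1);

setInterval(() => {
  const { rss, heapUsed, heapTotal, external } = process.memoryUsage();
  console.log(
    `[mem] rss=${toMiB(rss)}MiB heapUsed=${toMiB(heapUsed)}MiB ` +
      `heapTotal=${toMiB(heapTotal)}MiB external=${toMiB(external)}MiB`
  );
}, SAMPLE_INTERVAL_MS).unref(); // unref() so the timer alone does not keep the process alive
```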

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 9 (6 by maintainers)

Top GitHub Comments

7 reactions
Shabirmean commented, Nov 26, 2021

According to the profiler data for the currencyservice and paymentservice, the retry-request package is the one that seems to be using a lot of memory. It is imported by the google-cloud/common library, which is used by google-cloud/tracing, google-cloud/debug and google-cloud/profiler.
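
One way to corroborate a finding like this locally (an assumption on my part, not a step described in the issue) is to dump a V8 heap snapshot from the running service and inspect the retainer tree in Chrome DevTools to see which module's objects keep accumulating. A sketch using the built-in v8 module (available on recent Node versions):

```js
// Dump a heap snapshot on demand (e.g. `kill -USR2 <pid>`) and open the
// resulting .heapsnapshot file in Chrome DevTools -> Memory to see which
// objects (and which requiring modules) are being retained between samples.
const v8 = require('v8');

process.on('SIGUSR2', () => {
  const file = v8.writeHeapSnapshot(); // writes a .heapsnapshot file in the CWD
  console.log(`heap snapshot written to ${file}`);
});
```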

The same behaviour is reported in the google-cloud/debug Node.js repository. As per this recent comment, the issue seems to have gone away after disabling google-cloud/debug.

I have created four PRs to stage four clusters with different settings to observe how memory usage behaves over time (a minimal sketch of the toggle approach follows the list):

  • #637 - has no google-cloud/debug
  • #638 - has no google-cloud/trace
  • #639 - has no google-cloud/profiler
  • #640 - has all three of the above disabled
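
For reference, a rough sketch of what such a per-cluster toggle can look like in a Node.js service. The environment-variable names and the currencyservice service name are illustrative assumptions, not necessarily what the PRs above use; the @google-cloud agent start() calls are the packages' documented entry points.

```js
// Start each Cloud Operations agent only when it is not explicitly disabled,
// so individual staging clusters can turn off debug, trace, or profiler.
if (process.env.DISABLE_DEBUGGER !== '1') {
  require('@google-cloud/debug-agent').start({ allowExpressions: true });
}
if (process.env.DISABLE_TRACING !== '1') {
  require('@google-cloud/trace-agent').start();
}
if (process.env.DISABLE_PROFILER !== '1') {
  require('@google-cloud/profiler')
    .start({ serviceContext: { service: 'currencyservice' } }) // service name is illustrative
    .catch((err) => console.warn('profiler failed to start', err));
}
```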

[Screenshot: memory usage across the four staging clusters]

1 reaction
Shabirmean commented, Jan 12, 2022

> Hi @Shabirmean,
>
> Please correct me if I’m wrong. We are now just waiting on this issue to be fixed via googleapis/cloud-debug-nodejs#811. Judging from Ben Coe’s comment, this is something they plan to fix.
>
> Let me know if there is any action we need to take in the meantime.

Hello @NimJay

There isn’t much we can do from our side. I have been in touch with Ben and am seeing if we can work with the debug team to get that issue (https://github.com/googleapis/cloud-debug-nodejs/issues/811) fixed. Until then, no action is needed or possible from our side. I suggest we keep this issue open!

Read more comments on GitHub >

Top Results From Across the Web

How to Find, Fix, and Prevent Node.js Memory Leaks
Memory management for any application is essential. This post looks at what memory leaks are and how to avoid them in Node.js applications....
Read more >
Debugging Memory Leaks in Node.js Applications - Toptal
Memory leaks in long running Node.js applications are like ticking time bombs that, if left unchecked in production environments, can result in devastating ......
Read more >
Understanding Node.js Memory Leaks - AppDynamics
Memory leaks can affect the performance of Node.js applications, causing them to run slowly, malfunction or freeze up completely. Many leaks ......
Read more >
How to Fix Memory Leaks in Python? - Section.io
The Python program, just like other programming languages, experiences memory leaks. Memory leaks in Python happen if the garbage collector ...
Read more >
Node.js Memory Leak Detection: How to Debug & Avoid Them
What Are Memory Leaks in Node.js? ... Long story short, it's when your Node.js app's CPU and memory usage increases over time for...
Read more >
