Agent: Improve memory usage

In a re-test of #1084, the memory consumption of the Agent component in 0.20.0 still seems relatively high.

With the default memory limit of 512 MB for the agent container in the admin pod, the agent container is OOMKilled after 8000 address ConfigMaps (half anycast, half brokered) have been created:

2018-06-04 13:39:50.947 [Thread-4] INFO  AddressController:109 - Total: 7589, Active: 5352, Configuring: 1942, Pending: 295, Terminating: 0, Failed: 0
2018-06-04 13:39:56.451 [Thread-4] INFO  AddressController:109 - Total: 7724, Active: 5352, Configuring: 2126, Pending: 246, Terminating: 0, Failed: 0
2018-06-04 13:40:02.651 [Thread-4] INFO  AddressController:109 - Total: 7897, Active: 5352, Configuring: 2237, Pending: 308, Terminating: 0, Failed: 0
2018-06-04 13:40:08.881 [Thread-4] INFO  AddressController:109 - Total: 8010, Active: 5352, Configuring: 2372, Pending: 286, Terminating: 0, Failed: 0
2018-06-04 13:40:14.574 [Thread-4] INFO  AddressController:109 - Total: 8010, Active: 5352, Configuring: 2545, Pending: 113, Terminating: 0, Failed: 0
2018-06-04 13:40:18.358 [Thread-4] INFO  AddressController:109 - Total: 8010, Active: 5352, Configuring: 2658, Pending: 0, Terminating: 0, Failed: 0
    Image:          docker.io/bsinno/iothub-enmasse-agent:0.20.0_2018-06-01_83_BOSCH
    Image ID:       docker-pullable://bsinno/iothub-enmasse-agent@sha256:140282fddc9721ec5d3536052ef6111d292d0d0e06becb8e79302888e8deeb56
    Ports:          8888/TCP, 8080/TCP, 56720/TCP
    Host Ports:     0/TCP, 0/TCP, 0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Mon, 04 Jun 2018 15:40:16 +0200
      Finished:     Mon, 04 Jun 2018 15:40:56 +0200
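
The status lines above can be checked mechanically rather than by eye. The following is a minimal Python sketch, assuming the log format is exactly as quoted; the helper names (parse_status, backlog) are hypothetical and not part of the agent. It extracts the counters and reports how many addresses are not yet active, which makes it easy to see that "Active" stays at 5352 while the total keeps growing.

```python
import re

# Matches the AddressController status lines quoted in the issue.
# Assumption: the format is stable; adjust the pattern if the logger layout differs.
STATUS_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) .*AddressController:\d+ - "
    r"Total: (?P<total>\d+), Active: (?P<active>\d+), "
    r"Configuring: (?P<configuring>\d+), Pending: (?P<pending>\d+), "
    r"Terminating: (?P<terminating>\d+), Failed: (?P<failed>\d+)"
)


def parse_status(line):
    """Return the counters of one status line as a dict, or None if it does not match."""
    m = STATUS_RE.match(line.strip())
    if m is None:
        return None
    fields = m.groupdict()
    ts = fields.pop("ts")
    status = {name: int(value) for name, value in fields.items()}
    status["ts"] = ts
    return status


def backlog(status):
    """Number of addresses that exist but are not yet active."""
    return status["total"] - status["active"]


# Two of the lines from the 512 MB run, copied from the log above.
SAMPLE = """\
2018-06-04 13:39:50.947 [Thread-4] INFO  AddressController:109 - Total: 7589, Active: 5352, Configuring: 1942, Pending: 295, Terminating: 0, Failed: 0
2018-06-04 13:40:18.358 [Thread-4] INFO  AddressController:109 - Total: 8010, Active: 5352, Configuring: 2658, Pending: 0, Terminating: 0, Failed: 0
"""

statuses = [s for s in map(parse_status, SAMPLE.splitlines()) if s is not None]

if __name__ == "__main__":
    for s in statuses:
        print(s["ts"], "total:", s["total"], "active:", s["active"], "backlog:", backlog(s))
```

Feeding the full agent log through parse_status in the same way would show whether the backlog shrinks again after a restart or keeps growing until the next OOM kill.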

With the memory limit raised to 1 GB (1Gi), an OOMKilled error occurred at 12000 addresses:

2018-06-04 13:51:57.523 [Thread-4] INFO  AddressController:109 - Total: 12010, Active: 8010, Configuring: 4000, Pending: 0, Terminating: 0, Failed: 0
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Mon, 04 Jun 2018 15:49:28 +0200
      Finished:     Mon, 04 Jun 2018 15:52:11 +0200
    Ready:          True
    Restart Count:  1
    Limits:
      memory:  1Gi
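
Both terminations report exit code 137 together with the reason OOMKilled. By the usual convention, an exit code above 128 means the process was ended by a signal, with the signal number being the code minus 128; 137 therefore corresponds to signal 9 (SIGKILL), which is what the kernel sends when a container exceeds its memory limit. A small Python sketch of that decoding follows; the function name describe_exit_code is hypothetical.

```python
import signal


def describe_exit_code(code):
    """Describe a container exit code.

    Convention: codes above 128 mean "terminated by signal (code - 128)".
    """
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with status {code}"


if __name__ == "__main__":
    print(describe_exit_code(137))
```

Note that the exit code alone only says SIGKILL; it is the "Reason: OOMKilled" field in the pod description that attributes the kill to the memory limit.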

For another test, the memory limit of the agent container was increased to 1500 MB. After adding 12000 addresses, memory usage of the admin pod reached about 1200 MB (this figure also includes the “standard controller” container).

2018-06-05 14:35:47.646 [Thread-4] INFO  AddressController:109 - Total: 10, Active: 10, Configuring: 0, Pending: 0, Terminating: 0, Failed: 0
...
2018-06-05 14:56:24.877 [Thread-4] INFO  AddressController:109 - Total: 12010, Active: 12010, Configuring: 0, Pending: 0, Terminating: 0, Failed: 0

kubectl top pod

         NAME                                            CPU(cores)   MEMORY(bytes) 
16:35:48 admin-68ff74466b-4s9c9                          10m          211Mi           
..
16:47:02 admin-68ff74466b-4s9c9                          1118m        930Mi           
16:47:07 admin-68ff74466b-4s9c9                          1241m        998Mi           
16:48:01 admin-68ff74466b-4s9c9                          1241m        998Mi           
16:48:06 admin-68ff74466b-4s9c9                          1055m        1090Mi          
16:49:09 admin-68ff74466b-4s9c9                          1096m        1050Mi          
16:50:07 admin-68ff74466b-4s9c9                          1088m        1140Mi          
16:51:05 admin-68ff74466b-4s9c9                          1088m        1140Mi          
16:52:03 admin-68ff74466b-4s9c9                          1071m        1168Mi          
16:53:00 admin-68ff74466b-4s9c9                          1090m        1012Mi          
16:54:03 admin-68ff74466b-4s9c9                          1112m        1226Mi                  
16:56:04 admin-68ff74466b-4s9c9                          1138m        1037Mi    
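
To put a rough number on these samples, the sketch below (Python, with hypothetical helper names) parses the timestamped `kubectl top pod` lines shown above and computes the peak pod memory and the growth relative to the first sample. Dividing the growth by the 12000 addresses added gives only a coarse per-address figure: it is an assumption that the growth is attributable to addresses at all, the value covers the whole admin pod (agent plus standard controller), and it is affected by garbage collection timing.

```python
import re

# One sample line: "<HH:MM:SS> <pod name> <cpu>m <memory>Mi".
# The leading timestamp was added by the reporter; it is not part of kubectl's output.
SAMPLE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2})\s+(\S+)\s+(\d+)m\s+(\d+)Mi$")

# Samples copied from the issue; the ".." elision line is skipped by the parser.
TOP_OUTPUT = """\
16:35:48 admin-68ff74466b-4s9c9                          10m          211Mi
..
16:47:02 admin-68ff74466b-4s9c9                          1118m        930Mi
16:47:07 admin-68ff74466b-4s9c9                          1241m        998Mi
16:48:01 admin-68ff74466b-4s9c9                          1241m        998Mi
16:48:06 admin-68ff74466b-4s9c9                          1055m        1090Mi
16:49:09 admin-68ff74466b-4s9c9                          1096m        1050Mi
16:50:07 admin-68ff74466b-4s9c9                          1088m        1140Mi
16:51:05 admin-68ff74466b-4s9c9                          1088m        1140Mi
16:52:03 admin-68ff74466b-4s9c9                          1071m        1168Mi
16:53:00 admin-68ff74466b-4s9c9                          1090m        1012Mi
16:54:03 admin-68ff74466b-4s9c9                          1112m        1226Mi
16:56:04 admin-68ff74466b-4s9c9                          1138m        1037Mi
"""


def parse_samples(text):
    """Return a list of (time, pod, cpu_millicores, memory_mib) tuples."""
    samples = []
    for line in text.splitlines():
        m = SAMPLE_RE.match(line.strip())
        if m:
            ts, pod, cpu, mem = m.groups()
            samples.append((ts, pod, int(cpu), int(mem)))
    return samples


samples = parse_samples(TOP_OUTPUT)
baseline_mib = samples[0][3]                 # first sample, before the addresses were added
peak = max(samples, key=lambda s: s[3])      # sample with the highest memory reading
growth_mib = peak[3] - baseline_mib

ADDRESSES_ADDED = 12000                      # total went from 10 to 12010 in the log
per_address_kib = growth_mib * 1024 / ADDRESSES_ADDED

if __name__ == "__main__":
    print(f"peak: {peak[3]}Mi at {peak[0]}")
    print(f"growth over baseline: {growth_mib}Mi")
    print(f"rough pod-level cost per address: {per_address_kib:.1f} KiB")
```

Running `kubectl top pod --containers` instead would split the reading per container and allow the same calculation for the agent container alone.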

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

1 reaction
grs commented, Jul 3, 2018

With the latest master I can create 12000 addresses (half pooled queues, half anycast) without any crashes using the default memory limit (512 MB). Are you still seeing issues?

0 reactions
calohmn commented, Jul 4, 2018

@grs good point, I see there’s Node v8.11.2 being used now. I’ll repeat the test.
