RSS keeps growing until a server crash

I’ve been triaging this bug for a while now, testing various setups and stripping the server down to rule out possible causes. This is what I’ve come up with:

Client - socket.io 1.3.7

var socket = io('http://<my Socket.IO host name>.elasticbeanstalk.com:80', {
  reconnection: false,
  transports: ['websocket'],
  upgrade: false
});
  • Not reconnecting is acceptable for our use case, and I wanted to reduce hits to the server
  • I chose to skip long-polling because browser support for WebSockets is acceptable for our use case, and while researching I found claims that long-polling could cause connection-related issues

Server - socket.io 1.3.7

require("appdynamics").profile(<my AppDynamics params, using SSL>);
var server = require('http').createServer(),
  io = require('socket.io')(server, {
    serveClient: false,
    origins: "http://www.<my web host name>.com:*",
    transports: ['websocket'],
    allowUpgrades: false
  });
io.on('connection', function(socket) {
  socket.on('disconnect',function() {});
});
server.listen(process.env.PORT);

This is not a sample; it is the actual server.js running right now. I stripped out my functions one by one until only this Socket.IO boilerplate code remained.

I did this because much of the advice I found in related issues blamed badly written user code: poor variable scoping decisions, leaked memory, and so on.

This way, I’m pretty sure that none of the code I wrote is causing the issue, since it still happens.

To explain options:

  • I turned off client serving and serve the client JS from our web host instead, to reduce hits to the Socket.IO server
  • As mentioned, I chose not to use long-polling, to rule it out as a cause

Hardware

  • Amazon Elastic Beanstalk NodeJS SaaS
  • single m1.medium instance with 3.75 GB RAM
  • running without a proxy (to rule that out as the cause as well)
  • swap is turned off (to rule that out as the cause)
  • NodeJS 0.12.6

Traffic

  • about 1M connections per day, but it never gets that far; the server usually crashes at around 0.5M
  • as the server code above shows, I’m not emitting any messages
  • the maximum number of concurrent users is approximately 4000

Symptoms

  • with new connections, RSS (resident set size) starts rising immediately, usually at a rate of 400-500 MB per hour
  • with no new connections, RSS stays the same; it does not drop
  • when RSS hits the hardware limit (3.4 GB), the NodeJS process crashes
  • before I turned swap off, the process started swapping once all of the RAM was used, and the app began to lag and drop connections
  • in contrast to RSS, the heap remains more or less stable at 100-300 MB, hinting that the issue might be buffer-related
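
The traffic and RSS figures above can be combined into a rough per-connection estimate. The sketch below is only a back-of-the-envelope calculation, not a measurement: the function name is hypothetical, and it assumes connections are spread evenly over 24 hours, which real traffic will not be.

```javascript
// Rough estimate of how much RSS growth each new connection accounts for.
// Inputs are the figures quoted in the issue; the even spread of
// connections across the day is an assumption made for illustration.
function estimateBytesPerConnection(connectionsPerDay, rssGrowthMBPerHour) {
  var connectionsPerHour = connectionsPerDay / 24;
  var bytesPerHour = rssGrowthMBPerHour * 1024 * 1024;
  return bytesPerHour / connectionsPerHour;
}

var low = estimateBytesPerConnection(1e6, 400);
var high = estimateBytesPerConnection(1e6, 500);
console.log('retained per connection (KiB): ' +
  (low / 1024).toFixed(1) + ' to ' + (high / 1024).toFixed(1));
```

Under those assumptions the growth works out to roughly 10 KiB per connection, a size that would be consistent with a small per-socket allocation living outside the V8 heap, although the numbers alone cannot confirm that.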

Monitoring

  • using the AppDynamics NodeJS agent (note: this issue also appeared when I used the NewRelic agent instead, and even when I had no monitoring agent and only watched the server stats in Munin)
  • This is a screenshot from this morning. The drop at 9:46 is a NodeJS restart. Hint: the increasing number of connections is questionable, since Google Analytics reports lower and more stable numbers. [Screenshot from 2015-10-22 11:34:00]
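
For comparing RSS against the heap without any external agent, the process can report its own numbers. This is a minimal sketch rather than what the issue author ran: `formatMemorySample` and `startMemoryLogger` are hypothetical helper names, while `process.memoryUsage()` is the standard Node.js call that returns `rss`, `heapTotal` and `heapUsed` in bytes.

```javascript
// Formats one process.memoryUsage() sample in megabytes.
function formatMemorySample(usage) {
  function toMB(bytes) {
    return (bytes / 1024 / 1024).toFixed(1);
  }
  return 'rss=' + toMB(usage.rss) + 'MB' +
    ' heapTotal=' + toMB(usage.heapTotal) + 'MB' +
    ' heapUsed=' + toMB(usage.heapUsed) + 'MB';
}

// Logs a sample every intervalMs and returns the timer so it can be cleared.
function startMemoryLogger(intervalMs, log) {
  var timer = setInterval(function () {
    log(formatMemorySample(process.memoryUsage()));
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for logging
  return timer;
}

// One-off sample; for periodic output, call for example
// startMemoryLogger(60000, console.log) next to server.listen().
console.log(formatMemorySample(process.memoryUsage()));
```

A widening gap between `rss` and `heapTotal` in the logged lines would match the symptom described above, where memory grows outside the JavaScript heap.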

Issue Analytics

  • State: closed
  • Created 8 years ago
  • Comments: 7 (2 by maintainers)

Top GitHub Comments

komorebi-san commented, Nov 2, 2020

@nebkam I tracked it down and realised that it is caused by poor internet connectivity on the client side: when that happens, memory usage goes up quickly. I wish socket.io had an option to periodically check each client’s connectivity and close the dead connections.
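
The kind of periodic check described here can be sketched independently of any particular library. Everything below is an illustrative assumption rather than a Socket.IO feature: `createConnectionReaper` and its methods are hypothetical names, and the connection objects are only assumed to expose a `close()` method (a Socket.IO socket would use `disconnect()` instead).

```javascript
// Tracks when each connection was last heard from and closes silent ones.
// timeoutMs: how long a connection may stay silent before it is closed.
// now: optional clock function, injectable for testing (defaults to Date.now).
function createConnectionReaper(timeoutMs, now) {
  now = now || Date.now;
  var entries = []; // items: { conn: <connection>, seenAt: <ms timestamp> }

  function indexOfConn(conn) {
    for (var i = 0; i < entries.length; i++) {
      if (entries[i].conn === conn) return i;
    }
    return -1;
  }

  return {
    // Call on connect and whenever the client shows signs of life.
    touch: function (conn) {
      var i = indexOfConn(conn);
      if (i === -1) entries.push({ conn: conn, seenAt: now() });
      else entries[i].seenAt = now();
    },
    // Call from the connection's own close/disconnect handler.
    forget: function (conn) {
      var i = indexOfConn(conn);
      if (i !== -1) entries.splice(i, 1);
    },
    // Drops and closes every connection silent for longer than timeoutMs.
    sweep: function () {
      var cutoff = now() - timeoutMs;
      var stale = [];
      entries = entries.filter(function (entry) {
        if (entry.seenAt < cutoff) {
          stale.push(entry.conn);
          return false;
        }
        return true;
      });
      stale.forEach(function (conn) {
        conn.close(); // assumed method name on the connection object
      });
      return stale.length;
    },
    size: function () {
      return entries.length;
    }
  };
}
```

In a server, `sweep()` would typically run from a `setInterval`, with `touch()` called from message or pong handlers. The linear lookup keeps the sketch short; a keyed structure would suit thousands of concurrent connections better.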

Thanks for the library recommendation. Will definitely look into it.

nebkam commented, Oct 29, 2020

@komorebi-san After experimenting with different adapters for rooms (for example, MongoDB instead of memory), I noticed a significant number of sockets not being cleaned up; they stayed on the server, taking up resources. Back then, I traced it to some older browsers not properly disconnecting the socket when the browser was closed. So I chose to implement the feature without support for older browsers, replaced my Socket.IO code with the low-level ws library, and wrote my own “room” logic. That server is in production to this day, serving 16M pageviews/month.
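
Custom room logic of this kind can be quite small. The sketch below is not nebkam’s implementation, which the comment does not show: `createRooms` and its methods are hypothetical, and connections are only assumed to expose a `send(message)` method, as ws sockets do.

```javascript
// Minimal in-memory room registry: room name -> array of connections.
function createRooms() {
  var rooms = Object.create(null); // no prototype, so any room name is safe

  return {
    // Adds a connection to a room, creating the room on first use.
    join: function (name, conn) {
      var members = rooms[name] || (rooms[name] = []);
      if (members.indexOf(conn) === -1) members.push(conn);
    },
    // Removes a connection from every room; call this from the close
    // handler so dead sockets do not stay referenced.
    leaveAll: function (conn) {
      Object.keys(rooms).forEach(function (name) {
        var i = rooms[name].indexOf(conn);
        if (i !== -1) rooms[name].splice(i, 1);
        if (rooms[name].length === 0) delete rooms[name];
      });
    },
    // Sends a message to every member and returns how many received it.
    broadcast: function (name, message) {
      var members = rooms[name] || [];
      members.forEach(function (conn) {
        conn.send(message); // assumed method on the connection object
      });
      return members.length;
    },
    roomCount: function () {
      return Object.keys(rooms).length;
    }
  };
}
```

The important design point is that `leaveAll` runs on every close, so an empty room is deleted rather than left behind holding references to sockets.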
