RSS keeps growing until the server crashes
I've been triaging this bug for a while now. I've been testing with various setups and stripping things down to exclude possible causes. This is what I've come up with:
Client - socket.io 1.3.7
var socket = io('http://<my Socket.IO host name>.elasticbeanstalk.com:80', {
  reconnection: false,
  transports: ['websocket'],
  upgrade: false
});
- Not reconnecting is acceptable for our use case, and I wanted to reduce hits to the server
- I chose to skip long-polling: browser support for WebSockets is acceptable for our use case, and while researching I found claims that long-polling could cause connection-related issues
Server - socket.io 1.3.7
require("appdynamics").profile(<my AppDynamics params, using SSL>);

var server = require('http').createServer(),
    io = require('socket.io')(server, {
      serveClient: false,
      origins: "http://www.<my web host name>.com:*",
      transports: ['websocket'],
      allowUpgrades: false
    });

io.on('connection', function(socket) {
  socket.on('disconnect', function() {});
});

server.listen(process.env.PORT);
This is not a sample; it is the actual server.js running right now. I have been stripping my functions down one by one until I came to this, essentially Socket.IO boilerplate code.
The reason for this is that while researching and reading related issues, much of the advice I found blamed badly written user code: bad variable scope decisions, leaked memory, etc.
This way, I’m pretty sure that none of the code I wrote caused this issue (since this still happens).
To explain options:
- I turned off the client serving and I’m serving client JS from our web host, to reduce hits to the Socket.IO server
- As mentioned, I chose not to use long-polling, to exclude it as a source of issues
Hardware
- Amazon Elastic Beanstalk NodeJS SaaS
- single m1.medium instance with 3.75 GB RAM
- running without a proxy (to exclude that as the issue as well)
- swapping is turned off (to exclude that as the issue)
- NodeJS 0.12.6
Traffic
- about 1M connections per day, but it never gets that far, usually crashes around 0.5M
- as you saw before, I’m not emitting any messages
- max number of concurrent users is approximately 4000
Symptoms
- with new connections coming in, the RSS (resident set size) rises immediately, usually at a rate of 400-500 MB per hour
- with no new connections, the RSS just stays the same; it does not drop
- when the RSS hits the hardware limit (3.4 GB), the NodeJS process crashes
- before I turned swapping off, the process started swapping once all of the RAM was used, and the app started to lag and drop connections
- contrary to the RSS, the heap remains more or less stable at 100-300 MB, hinting that the issue might be buffer-related (i.e. memory held outside the V8 heap)
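One way to watch this split between RSS and heap directly (a minimal sketch added for illustration, not part of the original report) is to log process.memoryUsage() periodically:

```javascript
// Periodically log the gap between RSS and the V8 heap.
// A growing rss with a flat heapUsed points at memory held
// outside the heap: Buffers, native allocations, fragmentation.
function logMemory() {
  var m = process.memoryUsage();
  var toMB = function(bytes) { return (bytes / 1024 / 1024).toFixed(1); };
  console.log(
    'rss=' + toMB(m.rss) + 'MB',
    'heapUsed=' + toMB(m.heapUsed) + 'MB',
    'offHeap=' + toMB(m.rss - m.heapUsed) + 'MB'
  );
}

setInterval(logMemory, 60 * 1000).unref(); // once a minute; unref so it never keeps the process alive
logMemory();
```

Correlating this log with the connection count would show whether the off-heap portion grows per connection or per time.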
Monitoring
- using the AppDynamics NodeJS agent (note: this issue has also appeared when I had the NewRelic agent instead, and even with no monitoring agent at all, only watching the server stats from Munin)
- This is a screenshot from this morning. The drop at 9:46 is a NodeJS restart. Hint: the increasing number of connections is questionable, since Google Analytics reports lower and more stable numbers.
Issue Analytics
- State:
- Created: 8 years ago
- Comments: 7 (2 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
@nebkam I tracked it and realised that it's caused by poor internet connectivity on the client side; with such clients, the memory usage goes up quickly. I am hoping that socket.io could add an option to periodically check each client's connectivity status and close the dead ones.
Thanks for the library recommendation. Will definitely look into it.
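Engine.IO (the transport layer under Socket.IO) already has a heartbeat mechanism that does roughly this. A sketch of tightening it, assuming the `pingInterval`/`pingTimeout` options are passed through as in Socket.IO 1.x (the values below are illustrative, not tuned):

```javascript
// Sketch: lower the heartbeat interval/timeout so sockets whose
// clients silently vanish (flaky connectivity, closed laptops)
// are detected and closed sooner instead of lingering in memory.
var server = require('http').createServer(),
    io = require('socket.io')(server, {
      transports: ['websocket'],
      allowUpgrades: false,
      pingInterval: 10000, // heartbeat round-trip every 10 s
      pingTimeout: 5000    // close the socket if no reply within 5 s
    });

server.listen(process.env.PORT);
```

The trade-off is extra heartbeat traffic per connection, which matters at ~4000 concurrent users.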
@komorebi-san After experimenting with different adapters for rooms (for example, MongoDB instead of memory), I noticed that a considerable number of sockets were not being cleaned up; they stayed on the server, taking up resources. Back then, I traced it to some older browsers not properly disconnecting the socket on browser close. So I chose to implement the feature without support for older browsers: I replaced my Socket.IO code with the low-level ws library and wrote my own custom "room" logic. That server is in production to this day, serving 16M pageviews/month.
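The custom "room" logic is not shown in the issue; below is a minimal sketch of how such bookkeeping might look on top of ws (the function names and the socket shape are hypothetical, not from the original comment):

```javascript
// Minimal room bookkeeping: a Map from room name to a Set of sockets.
// Any object with a send(message) method works, so raw ws sockets fit.
var rooms = new Map();

function join(room, socket) {
  if (!rooms.has(room)) rooms.set(room, new Set());
  rooms.get(room).add(socket);
}

function leave(room, socket) {
  var members = rooms.get(room);
  if (!members) return;
  members.delete(socket);
  if (members.size === 0) rooms.delete(room); // free empty rooms
}

function broadcast(room, message) {
  var members = rooms.get(room);
  if (!members) return 0;
  members.forEach(function(socket) { socket.send(message); });
  return members.size;
}
```

On a real ws server you would call join on connection and leave in the socket's 'close' and 'error' handlers, so dead sockets cannot accumulate the way the leaked Socket.IO sockets did.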