Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging third-party libraries. It collects links to all the places you might be looking while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Intermittent "Connection is closed" errors

See original GitHub issue

We are currently working on a Lambda function that connects to a Redis 3.2.10 cluster on AWS ElastiCache.

The Lambda function connects to the Redis cluster, runs KEYS on each master node, collects the responses from each node and returns an array of keys. We then publish an SNS message for each key in this array and close the cluster connection before the Lambda ends.

AWS Lambda freezes and thaws the container in which programs run, so ideally we would create a connection once and re-use it on every invocation. However, we have found that for the Lambda to end we must explicitly end the client connection to the cluster, because Lambda waits for the Node event loop to empty before it finishes. This is why we create the connection at the start of the function (representing a Lambda invocation), run our queries, and then attempt to gracefully .quit() the Redis.Cluster connection once they complete.
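In outline, the per-invocation flow looks something like the sketch below. This is only an illustration of the shape described above, not our production code: the SNS topic ARN, environment variable and handler layout are placeholders.

const Redis = require("ioredis");
const { SNS } = require("aws-sdk");

const sns = new SNS();

// Hypothetical topic ARN, for illustration only.
const TOPIC_ARN = process.env.SNS_TOPIC_ARN;

exports.handler = async () => {
  // A fresh connection per invocation, so the event loop can drain once we quit.
  const cluster = new Redis.Cluster(["cluster.id.clustercfg.euw1.cache.amazonaws.com"]);

  try {
    // Run KEYS on every master node and flatten the responses into one array.
    const responses = await Promise.all(
      cluster.nodes("master").map((node) => node.keys("*:*"))
    );
    const keys = responses.flat();

    // Publish one SNS message per key.
    for (const key of keys) {
      await sns.publish({ TopicArn: TOPIC_ARN, Message: key }).promise();
    }

    return keys;
  } finally {
    // Close the connection so Lambda is not left waiting on an open socket.
    await cluster.quit();
  }
};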

I can’t share the actual code that we’re working on, but I’ve been able to extract the logic and create a simple example of the issue we’re facing:

test.js

const Redis = require("ioredis");

// Every 500ms, open a fresh cluster connection, run KEYS on each master,
// then gracefully quit the connection.
let interval = setInterval(() => {
  let conn = new Redis.Cluster(["cluster.id.clustercfg.euw1.cache.amazonaws.com"]);

  Promise
    .all(conn.nodes("master").map((node) => {
      return node.keys("*:*");
    }))

    .then((resp) => {
      console.log("Complete KEYS on all nodes", JSON.stringify(resp));
      return conn.quit();
    })

    .then(() => {
      console.log("Gracefully closed connection");
    })

    .catch((e) => {
      console.log("Caught rejection: ", e.message);
    });
}, 500);

// Stop opening new connections after 3 seconds.
setTimeout(() => {
  clearInterval(interval);
}, 3000);

Example output:

  ioredis:cluster status: [empty] -> connecting +0ms
  ioredis:redis status[cluster.id.clustercfg.euw1.cache.amazonaws.com:6379]: [empty] -> wait +5ms
  ioredis:cluster getting slot cache from cluster.id.clustercfg.euw1.cache.amazonaws.com:6379 +1ms
  ioredis:redis status[cluster.id.clustercfg.euw1.cache.amazonaws.com:6379]: wait -> connecting +2ms
  ioredis:redis queue command[0] -> cluster(slots) +1ms
  ioredis:redis queue command[0] -> keys(*:*) +1ms
  ioredis:redis status[10.1.0.45:6379]: connecting -> connect +21ms
  ioredis:redis write command[0] -> info() +0ms
  ioredis:redis status[10.1.0.45:6379]: connect -> ready +5ms
  ioredis:connection send 2 commands in offline queue +1ms
  ioredis:redis write command[0] -> cluster(slots) +0ms
  ioredis:redis write command[0] -> keys(*:*) +0ms
  ioredis:redis status[10.1.1.131:6379]: [empty] -> wait +3ms
  ioredis:redis status[10.1.2.152:6379]: [empty] -> wait +1ms
  ioredis:redis status[10.1.0.45:6379]: [empty] -> wait +0ms
  ioredis:cluster status: connecting -> connect +0ms
  ioredis:redis queue command[0] -> cluster(info) +1ms
Complete KEYS on all nodes [["132f28d0-8322-43d6-bbbd-200a19c130c0:tf0NuoVBZIXDIryxBRj3lrcayXeHwaoD"]]
  ioredis:cluster status: connect -> disconnecting +2ms
  ioredis:redis queue command[0] -> quit() +0ms
  ioredis:redis status[10.1.1.131:6379]: wait -> connecting +0ms
  ioredis:redis status[10.1.2.152:6379]: wait -> connecting +0ms
  ioredis:redis status[10.1.0.45:6379]: wait -> connecting +0ms
  ioredis:redis status[10.1.1.131:6379]: connecting -> end +2ms
  ioredis:redis status[10.1.2.152:6379]: connecting -> end +0ms
  ioredis:redis status[10.1.0.45:6379]: connecting -> end +0ms
  ioredis:redis status[10.1.0.45:6379]: ready -> close +1ms
  ioredis:connection skip reconnecting since the connection is manually closed. +1ms
  ioredis:redis status[10.1.0.45:6379]: close -> end +0ms
  ioredis:cluster status: disconnecting -> close +2ms
  ioredis:cluster status: close -> end +0ms
Caught rejection:  Connection is closed.
  ioredis:delayqueue send 1 commands in failover queue +100ms
  ioredis:cluster status: end -> disconnecting +2ms
// SNIP

Why would we be getting the "Connection is closed" rejection? This feels like a bug, as I think we are going about this in the correct way, but I’m happy to be proved wrong!

Issue Analytics

  • State: open
  • Created: 6 years ago
  • Reactions: 9
  • Comments: 30 (3 by maintainers)

Top GitHub Comments

14 reactions
JoeNyland commented, May 24, 2018

I’m commenting here to confirm that this issue is still cropping up for us.

I’m not really sure why the bot above adds a “wontfix” label to an issue that hasn’t had any recent activity 🤔

7 reactions
elliotttf commented, Mar 27, 2018

I’ve also been able to reproduce this problem but only in AWS.

I believe the problem is related to the offline queue. The error originates when the close() method is called from the event_handler. The error eventually bubbles up in the redis class when flushQueue() is executed with a non-empty offline queue.

The commandQueue also occasionally causes this problem but it’s much less frequent.
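If that analysis is right, one possible mitigation (I haven’t confirmed it fixes the issue) is to wait for the cluster’s ready event before issuing any commands, so that quit() isn’t left sitting in an offline queue when a node connection closes. A rough sketch:

const Redis = require("ioredis");

function connectCluster(nodes) {
  return new Promise((resolve, reject) => {
    const cluster = new Redis.Cluster(nodes);

    // Resolve only once the cluster reports ready, so later commands are
    // written directly rather than parked in the offline queue.
    cluster.once("ready", () => resolve(cluster));
    cluster.once("error", reject);
  });
}

async function run() {
  const cluster = await connectCluster([
    "cluster.id.clustercfg.euw1.cache.amazonaws.com",
  ]);

  try {
    const resp = await Promise.all(
      cluster.nodes("master").map((node) => node.keys("*:*"))
    );
    console.log("Complete KEYS on all nodes", JSON.stringify(resp));
  } finally {
    await cluster.quit();
  }
}

run().catch((e) => console.log("Caught rejection:", e.message));

Alternatively, setting enableOfflineQueue: false (on the cluster options or via redisOptions) should make these failures surface immediately rather than when the connection is flushed at close time, at the cost of losing command buffering while connections are being established.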

Read more comments on GitHub >

Top Results From Across the Web

Troubleshooting intermittent outbound connection errors in ...
The major cause for intermittent connection issues is hitting a limit while making new outbound connections. The limits you can hit include:.
Read more >
** Troubleshooting ** Intermittent 'Connection to server lost' or ...
Intermittently the end users will receive the error message. It can occur at any time, whilst performing any action, inside any menu item....
Read more >
How To Fix An Intermittent Internet Connection In Windows 10
If your router can't maintain a steady connection, check if it's overheating and power it off until it cools down, or try power...
Read more >
Remote host closed connection. Possible SSL/TLS ...
You are seeing intermittent connection errors. This can range from various connectors, anywhere from using the HTTP requester or a DB connection ......
Read more >
Troubleshoot intermittent connectivity with Amazon Redshift
Intermittent connectivity issues in your Amazon Redshift cluster are caused by the following: ... "Error setting/closing connection".
Read more >
