Blocking not working properly in case of reconnections


There is an older issue, https://github.com/luin/ioredis/issues/610, that I also created. Since this is rather important, I would like to open a new one with specific code that reproduces the issue easily, so that it can be resolved once and for all 😃.

This is the code:


const Redis = require('ioredis');

async function main() {
  const client = new Redis();

  client.on('error', err => console.log('Redis error', err));
  client.on('reconnecting', msg => console.log('Redis reconnecting...', msg));
  client.on('close', () => console.log('Redis closed...'));
  client.on('connect', () => console.log('Redis connected...'));

  while (true) {
    try {
      console.log('going to block');
      const value = await client.brpoplpush('a', 'b', 4);
      console.log('unblocked', value);
    } catch (err) {
      console.error('ERROR', err);
    }
  }
}

main();

How to reproduce

Just run the code above against a local Redis server. The while loop will run forever, printing the following:

going to block
Redis connected...
unblocked null
going to block
unblocked null
going to block
unblocked null
going to block

It just blocks for up to 4 seconds, then unblocks, and so on. Now, while this program is running, stop the Redis server, wait a couple of seconds, and start it again.

The output will look like this:

Redis closed...
Redis reconnecting... 50
Redis connected...
Redis closed...
Redis reconnecting... 100
Redis error Error: connect ECONNREFUSED 127.0.0.1:6379
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1141:16) {
  errno: 'ECONNREFUSED',
  code: 'ECONNREFUSED',
  syscall: 'connect',
  address: '127.0.0.1',
  port: 6379
}
...
...
Redis closed...
Redis reconnecting... 700
Redis connected...

And that's all. The while loop does not continue, as if the call to client.brpoplpush had hung forever.

Expected results

I expect that as soon as the client disconnects, the call to client.brpoplpush rejects its promise with a connection error. Client code should then be able to call the blocking command again.
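
Until the pending command is rejected by the client itself, one possible stopgap is to put a deadline around the blocking call so the loop can notice a hang and try again. The sketch below is only an illustration under stated assumptions: withTimeout is a hypothetical helper, not an ioredis API, and it does not cancel the command or touch the connection. It only stops the caller from waiting forever.

```typescript
// Hypothetical helper (not part of ioredis): settles with the operation's
// result, or rejects if the operation has not settled within `ms`.
function withTimeout<T>(operation: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`operation did not settle within ${ms} ms`)),
      ms,
    );
    operation.then(
      value => {
        clearTimeout(timer); // settled in time, so drop the deadline
        resolve(value);
      },
      err => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// Assumed usage inside the loop from the issue (requires a Redis client):
//   const value = await withTimeout(client.brpoplpush('a', 'b', 4), 6000);
```

The 6000 ms deadline in the usage comment is an arbitrary assumption, chosen only so that it exceeds the 4-second block timeout. When the deadline fires, the existing catch block logs the error and the loop issues the command again.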

I am a bit surprised that no one else has reported this issue. I wonder whether my expectation is wrong or whether I am using the library incorrectly; if so, please let me know, since this issue is quite severe for users of the Bull/BullMQ libraries.

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 11 (4 by maintainers)

Top GitHub Comments

1 reaction
JJCella commented, Oct 17, 2022

I managed to reproduce this issue: it occurs when retryStrategy returns values less than 50 (exp(0), exp(1), exp(2) in the code snippet below, the BullMQ default).

I cannot reproduce it with the ioredis default retryStrategy, Math.min(times * 50, 2000), because it always returns numbers >= 50: https://github.com/luin/ioredis/blob/0db2d4f5f27d7106832c934a798e616836d1d0a6/lib/redis/event_handler.ts#L181

ioredis v5.2.3

node -v
v16.16.0

docker run -p 6379:6379 redis

redis-cli
127.0.0.1:6379> INFO
# Server
redis_version:7.0.5

import Redis from 'ioredis';

async function main() {

    // https://github.com/taskforcesh/bullmq/blob/0fb2964151166f2aece0270c54c8cb4f4e2eb898/src/classes/redis-connection.ts#L62
    const retryStrategy = (times: number) => Math.min(Math.exp(times), 20000);
    const client = new Redis({ retryStrategy });

    client.on('error', err => console.log('Redis error', err));
    client.on('reconnecting', msg => console.log('Redis reconnecting...', msg));
    client.on('close', () => console.log('Redis closed...'));
    client.on('connect', () => console.log('Redis connected...'));

    while (true) {
        try {
            console.log('brpoplpush');
            await client.brpoplpush('a', 'b', 4);
            console.log('brpoplpush end');
        } catch (err) {
            console.error('error', err);
        }
    }
}

main().catch(err => console.error('Unhandled error', err));

Output:

brpoplpush
Redis connected...
Redis closed...
Redis reconnecting... 2.718281828459045
Redis connected...
Redis error Error: read ECONNRESET
    at TCP.onStreamRead (node:internal/stream_base_commons:217:20) {
  errno: -104,
  code: 'ECONNRESET',
  syscall: 'read'
}
Redis closed...
Redis reconnecting... 7.38905609893065
Redis error Error: connect ECONNREFUSED 127.0.0.1:6379
    at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1187:16) {
  errno: -111,
  code: 'ECONNREFUSED',
  syscall: 'connect',
  address: '127.0.0.1',
  port: 6379
}
Redis closed...
Redis reconnecting... 20.085536923187668
Redis error Error: connect ECONNREFUSED
Redis closed...
Redis reconnecting... 54.598150033144236
Redis error Error: connect ECONNREFUSED
Redis closed...
Redis reconnecting... 148.4131591025766
Redis error Error: connect ECONNREFUSED
Redis closed...
Redis reconnecting... 403.4287934927351
Redis error Error: connect ECONNREFUSED
Redis closed...
Redis reconnecting... 1096.6331584284585
Redis connected...
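
To make the difference between the two strategies concrete, the sketch below prints the delay each one returns for the first few attempts. The first two functions are copied from the expressions quoted in this comment. The third, exponentialWithFloor, is a hypothetical variant added here for illustration, and its 50 ms floor is an assumption based only on the observation above that delays >= 50 did not reproduce the hang. Whether such a floor actually avoids the problem is not confirmed in this thread.

```typescript
// ioredis default strategy, as quoted above.
const ioredisDefault = (times: number): number => Math.min(times * 50, 2000);

// Strategy used in the reproduction snippet (the BullMQ default per the comment).
const exponential = (times: number): number => Math.min(Math.exp(times), 20000);

// Hypothetical variant: same curve, but never below `floorMs`.
// The 50 ms default is an assumption, not a documented threshold.
const exponentialWithFloor = (times: number, floorMs = 50): number =>
  Math.max(exponential(times), floorMs);

// Print the delay (in ms) each strategy would return for attempts 1..5.
for (let times = 1; times <= 5; times++) {
  console.log(
    `attempt ${times}:`,
    `default=${ioredisDefault(times)}`,
    `exponential=${exponential(times).toFixed(1)}`,
    `withFloor=${exponentialWithFloor(times).toFixed(1)}`,
  );
}
```

The first three exponential delays match the values logged in the output above (about 2.7, 7.4 and 20.1 ms), all below 50, while the default strategy starts at 50 ms.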
1 reaction
luin commented, Mar 15, 2021

Can anyone who is able to reproduce this issue enable the debug log (DEBUG=ioredis* node yourapp) and post the logs here? Also, we just released a new version (v4.24.2) with a connection-related fix. I don't think the two are related, but can you try the latest version to make sure?
