Undefined behavior for ZSCAN using readonly cluster client.
While using a ClusterReadOnlyConnectionPool, zscan and zscan_iter will not provide the usual SCAN guarantees. I believe this applies to sscan and hscan as well, though I’ve not tested them.
The issue arises from the overridden get_node_by_slot on ClusterReadOnlyConnectionPool, which randomly chooses between any master or slave node that serves the slot for the key. As subsequent scan commands in a single iteration are issued, they go to random nodes. The cursor values Redis returns are stateless; they are effectively offsets into the server’s own hash table, and they are not guaranteed to be consistent between a master and its slave.
An example I used to debug this was on a sorted set that currently contains 164 values, but has fluctuated significantly over its lifetime. If I issue an initial ZSCAN request to the master that owns it, I get the following response:
Command: ZSCAN mysortedset 0
1) "136"
2) 1) "2836834"
2) "0.4802068521853653"
3) "3599906"
4) "0.4656334842469258"
5) "3490931"
6) "0.22173393426291121"
7) "82109"
8) "0.48914307544693797"
9) "1244405"
10) "0.53974599172088655"
11) "2199081"
12) "0.34818095160929963"
13) "3967992"
14) "0.49706414372896185"
15) "1390822"
16) "0.20662256529819331"
17) "540680"
18) "0.53718780580831582"
19) "2840317"
20) "0.20240259687939222"
21) "812937"
22) "0.16832229396956788"
23) "2181749"
24) "0.23085035776582155"
Note: cursor value of 136.
If I run this same command on the slave node that holds this slot, I get the following results:
1) "240"
2) 1) "1078146"
2) "0.19975230285488776"
3) "3788365"
4) "0.59107889142186887"
5) "3195524"
6) "0.30029325316059524"
7) "1325801"
8) "0.42741925550104209"
9) "769388"
10) "0.19214136348401703"
11) "3718988"
12) "0.22575183419216338"
13) "3580511"
14) "0.3962135436706839"
15) "380687"
16) "0.3458031319795174"
17) "1999627"
18) "0.60407053063340199"
19) "1274471"
20) "0.37309465665899166"
Note: Different values, fewer values, and different cursor offset.
While the underlying set doesn’t change, reissuing these commands on the same server does provide consistent results, but switching between servers means that you end up with an arbitrary subset of what’s actually in the sorted set. This can be demonstrated through the Python library with the following (called on the same set as above, with no underlying changes in between):
Using the readonly cluster client, called in quick succession:
In [41]: len(set([i for i, _ in readonly_cluster.zscan_iter("mysortedset")]))
Out[41]: 120
In [42]: len(set([i for i, _ in readonly_cluster.zscan_iter("mysortedset")]))
Out[42]: 124
In [43]: len(set([i for i, _ in readonly_cluster.zscan_iter("mysortedset")]))
Out[43]: 122
In [44]: len(set([i for i, _ in readonly_cluster.zscan_iter("mysortedset")]))
Out[44]: 125
Using a non-readonly cluster client, called in quick succession:
In [45]: len(set([i for i, _ in cluster.zscan_iter("mysortedset")]))
Out[45]: 164
In [46]: len(set([i for i, _ in cluster.zscan_iter("mysortedset")]))
Out[46]: 164
In [47]: len(set([i for i, _ in cluster.zscan_iter("mysortedset")]))
Out[47]: 164
@theanti9 I am working on a fix; it should not be that difficult to solve, I just have to identify all methods that require this change and patch them inside the ReadonlyConnectionPool. Will update when I am done with it.
Just a note: all xSCAN-style methods are either broken or do not work properly, in both normal cases and in failover cases. They need another pass to sort them out, to make them work across multiple nodes in the cluster, and to work with the cursor mechanism. I will put this into the 3.0.0 backlog to take another big look at all xSCAN methods and sort them out.
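For what it’s worth, one plausible direction for such a fix (my assumption only, not necessarily the patch being written) is to make the readonly pool’s node choice deterministic per slot, so that every request in a single cursor sequence lands on the same replica. A minimal sketch, with hypothetical node tuples:

```python
import zlib

def pick_node_for_slot(nodes, slot):
    """Deterministically pick one of the candidate nodes for a slot.

    nodes is a list of (host, port) candidates (master plus replicas).
    Unlike a random choice, the same slot always maps to the same node,
    so SCAN cursors remain valid across consecutive requests.
    """
    return nodes[zlib.crc32(str(slot).encode()) % len(nodes)]

# Hypothetical addresses; the point is only the stable mapping.
candidates = [("10.0.0.1", 6379), ("10.0.0.2", 6379)]
assert pick_node_for_slot(candidates, 866) == pick_node_for_slot(candidates, 866)
```

This still spreads read load across replicas (different slots map to different nodes) while keeping any one key’s scan pinned to a single node; a real fix inside ClusterReadOnlyConnectionPool would also need to handle failover, which is beyond this sketch.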