scan_iter is *MUCH slower* than keys


Version (redis-py and Redis): 2.10.5

Platform: CentOS 6.6, Python 2.7.14

Description: According to https://redis.io/commands/keys:

Warning: consider KEYS as a command that should only be used in production environments with extreme care. It may ruin performance when it is executed against large databases. This command is intended for debugging and special operations, such as changing your keyspace layout. Don’t use KEYS in your regular application code. If you’re looking for a way to find keys in a subset of your keyspace, consider using SCAN or sets.

I wrote the following code to compare scan_iter and keys. The result was very surprising to me: scan_iter is MUCH slower than keys.

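import time

import redis

# Replace 'my_redis_host' with the address of your Redis server.
redis_cache = redis.Redis(host='my_redis_host')

# Time 100 full passes over the keyspace using SCAN (via scan_iter).
time_start = time.time()
for i in range(100):
    for key in redis_cache.scan_iter(match='MONITORABLE-*', count=100):
        pass
print(time.time() - time_start)

# Time 100 calls to KEYS with the same pattern.
time_start = time.time()
for i in range(100):
    for key in redis_cache.keys('MONITORABLE-*'):
        pass
print(time.time() - time_start)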

Result:

1.26971101761
0.0261030197144

After I changed the count from 100 to 10, it was even slower:

11.1533768177
0.0263831615448

My Redis DB is not very large (around 500 keys in total). Did I do anything wrong? Could you please give me some guidance on this? Thank you in advance!
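A rough back-of-the-envelope model may help explain the gap. It assumes, as a simplification, that each SCAN call examines about `count` keys, so one full pass over N keys costs roughly ceil(N / count) commands, while KEYS costs a single command. The helper name `estimated_commands` is hypothetical, and the 500-key figure is taken from the post above; the real COUNT option is only a hint, so actual numbers will differ.

```python
import math


def estimated_commands(total_keys, count=None, passes=100):
    """Estimate commands sent over `passes` full iterations.

    count=None models KEYS (one command per pass). Otherwise this assumes
    each SCAN call covers about `count` keys, which is a simplification.
    """
    if count is None:
        per_pass = 1
    else:
        per_pass = int(math.ceil(total_keys / float(count)))
    return per_pass * passes


print(estimated_commands(500, count=100))  # 500
print(estimated_commands(500, count=10))   # 5000
print(estimated_commands(500))             # 100
```

Under this model the count=10 run sends about ten times as many commands as the count=100 run, which is in line with the measured 11.15 s versus 1.27 s. On a database this small, the per-command round trip is likely to dominate the total time.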

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments:5 (2 by maintainers)

Top GitHub Comments

10 reactions
andymccurdy commented, Dec 7, 2018

@WayneYe The problem with KEYS is that it pulls all matched keys into memory at once. If you have a Redis server with a lot of keys, this could be problematic and the client could run out of memory. The SCAN commands instead pull matched keys into memory one chunk at a time.

SCAN is a memory optimization, not a CPU one.
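The difference described above can be sketched without a live server. In the example below, `InMemoryStandIn` and `scan_chunks` are hypothetical names made up for illustration, not part of redis-py, and the cursor is modeled as a plain list offset, whereas the real SCAN cursor is opaque and COUNT is only a hint.

```python
import fnmatch


class InMemoryStandIn(object):
    """Hypothetical stand-in for a Redis server; not part of redis-py."""

    def __init__(self, names):
        self._names = list(names)

    def keys(self, pattern):
        # Like KEYS: one call, every match returned in a single list.
        return [n for n in self._names if fnmatch.fnmatchcase(n, pattern)]

    def scan(self, cursor=0, match="*", count=10):
        # Simplified SCAN: look at `count` names starting at the cursor.
        window = self._names[cursor:cursor + count]
        next_cursor = cursor + count
        if next_cursor >= len(self._names):
            next_cursor = 0  # a cursor of 0 signals the end of the iteration
        return next_cursor, [n for n in window if fnmatch.fnmatchcase(n, match)]


def scan_chunks(client, match, count):
    """Yield one chunk of matching keys per SCAN call until the cursor is 0."""
    cursor = 0
    while True:
        cursor, chunk = client.scan(cursor=cursor, match=match, count=count)
        yield chunk
        if cursor == 0:
            break


server = InMemoryStandIn("MONITORABLE-%d" % i for i in range(500))
all_at_once = server.keys("MONITORABLE-*")
chunks = list(scan_chunks(server, "MONITORABLE-*", count=100))

print(len(all_at_once))             # 500 keys held in one list
print(max(len(c) for c in chunks))  # 100 keys at most per chunk
print(len(chunks))                  # 5 separate calls
```

The chunks are collected into a list here only so they can be counted. A real consumer would process each chunk and discard it, which is what keeps client memory bounded, at the price of one call per chunk.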

0 reactions
bowuL commented, Jan 6, 2020

@WayneYe The problem with KEYS is that it pulls all matched keys into memory at once. If you have a Redis server with a lot of keys, this could be problematic and the client could run out of memory. The SCAN commands instead pull matched keys into memory one chunk at a time.

SCAN is a memory optimization, not a CPU one.

waaa… get a very good answer

