Cache warm-ups never succeed
Globally, the cache warm-up tasks launched by the Celery workers all fail silently: they perform GETs on the main server's URL without providing the required authentication, but dashboards cannot be loaded without being logged in.
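For context, here is a minimal sketch of the kind of unauthenticated fetch involved. The URL and the use of urllib are illustrative, not the exact code in superset/tasks/cache.py: with no session cookie, Flask-AppBuilder answers with a 302 to /login/, and urllib follows it silently, so the caller ends up with a 200 from the login page.

from urllib import error, request

# Illustrative only: a bare GET with no session cookie or auth header,
# similar in spirit to what the warm-up task performs.
url = "http://superset:8088/superset/explore/?form_data=%7B%22slice_id%22%3A%2031%7D"
try:
    response = request.urlopen(url)
    # urllib transparently follows the 302, so this prints 200 and the
    # final URL is the /login/ page, not the explore endpoint.
    print(response.status, response.geturl())
except error.URLError:
    print("only network-level failures end up here")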
Related bugs:
- unit tests on this feature miss the error
- the documentation should mention that the Celery worker needs the --beat flag to listen on CeleryBeat schedules (cf. the docker-compose.yml configuration); a schedule sketch follows this list
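For reference, a sketch of the scheduling side, loosely adapted from the Superset caching docs. The broker URLs, cadence and strategy kwargs are assumptions and must match your own superset_config.py, and the schedule only fires if the worker runs with --beat (or a separate celery beat process):

from celery.schedules import crontab

# Illustrative subset of a CeleryConfig; names follow the pre-Celery-5
# uppercase style used by superset/config.py around version 0.36.
class CeleryConfig:
    BROKER_URL = "redis://redis:6379/0"            # assumed Redis service name and db
    CELERY_RESULT_BACKEND = "redis://redis:6379/1"
    CELERY_IMPORTS = ("superset.sql_lab", "superset.tasks")
    CELERYBEAT_SCHEDULE = {
        "cache-warmup-every-minute": {
            "task": "cache-warmup",                # task name matching the worker logs below
            "schedule": crontab(minute="*"),       # every minute, as in the reproduction patch
            "kwargs": {"strategy_name": "top_n_dashboards", "top_n": 5, "since": "7 days ago"},
        }
    }

CELERY_CONFIG = CeleryConfig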
At stake: long dashboard load times for our users, or outdated dashboards.
Main files to be fixed:
superset/tasks/cache.py
Expected results
When the Celery worker logs this (notice 'errors': []):
superset-worker_1 | [2020-04-20 13:05:00,299: INFO/ForkPoolWorker-3] Task cache-warmup[73c09754-4dcb-4674-9ac2-087b04b6e209]
succeeded in 0.1351924880000297s:
{'success': [
'http://superset:8088/superset/explore/?form_data=%7B%22slice_id%22%3A%2031%7D',
'http://superset:8088/superset/explore/?form_data=%7B%22slice_id%22%3A%2032%7D',
'http://superset:8088/superset/explore/?form_data=%7B%22slice_id%22%3A%2033%7D'],
'errors': []}
… we would expect to have something (more or less) like this in the Superset server logs:
superset_1 | 172.20.0.6 - - [2020-04-20 13:05:00,049] "POST /superset/explore_json/?form_data=%7B%22slice_id%22%3A HTTP/1.1"
200 738 "http://superset:8088/superset/dashboard/1/" "python-urllib2"
Of course, we also hope to see a bunch of keys stored in Redis, and dashboards loading lightning-fast.
Actual results
But we get these logs instead, which show a 302 redirect to the login page, followed by a 200 on the login page. This redirect is interpreted as a success by the task and its tests; one way to surface it as an error is sketched after these logs.
superset_1 | 172.20.0.6 - - [20/Apr/2020 08:12:00] "GET /superset/explore/?form_data=%7B%22slice_id%22%3A%2030%7D HTTP/1.1"
302 -
superset_1 | INFO:werkzeug:172.20.0.6 - - [20/Apr/2020 08:12:00] "GET /superset/explore/?form_data=%7B%22slice_id%22%3A%2030%7D HTTP/1.1"
302 -
superset_1 | 172.20.0.6 - - [20/Apr/2020 08:12:00] "GET /login/?next=http%3A%2F%2Fsuperset%3A8088%2Fsuperset%2Fexplore%2F%3Fform_data%3D%257B%2522slice_id%2522%253A%252030%257D HTTP/1.1"
200 -
(I added a few line breaks for readability.)
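One way to make the redirect visible (a sketch only, assuming the task keeps using urllib; this is not the actual fix from the PR mentioned in the comments): refuse to follow redirects, so the 302 surfaces as an HTTPError instead of a silent success.

from urllib import error, request

class NoRedirect(request.HTTPRedirectHandler):
    # Returning None makes the opener raise HTTPError on any 3xx response
    # instead of transparently following it.
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None

opener = request.build_opener(NoRedirect())
url = "http://superset:8088/superset/explore/?form_data=%7B%22slice_id%22%3A%2030%7D"
try:
    opener.open(url)
    print("warm-up request really succeeded")
except error.HTTPError as exc:
    # The 302 to /login/ now lands here instead of counting as a success.
    print("warm-up failed with HTTP", exc.code)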
In Redis, here is the only stored key:
$ docker-compose exec redis redis-cli
127.0.0.1:6379> KEYS *
1) "_kombu.binding.celery"
Lastly, the dashboards take a long time to load their data on first visit.
Screenshots
None
How to reproduce the bug
I had to patch the master branch to get this to work. In particular, I have to admit it was not very clear to me whether the config was read from docker/pythonpath_dev/superset_config.py or from superset/config.py (in principle, a superset_config.py found on PYTHONPATH overrides the defaults in superset/config.py). So I adapted superset/config.py and copied it over to the pythonpath one, which looks like it is read by the Celery worker but not by the server.
Anyway, this reproduces the bug:
- $ docker system prune --all, to remove all dangling images, exited containers and volumes.
- $ git checkout master && git pull origin master
- $ wget -O configs.patch https://gist.githubusercontent.com/Pinimo/c339ea828974d2141423b6ae64192aa4/raw/e449c97c11f81f7270d6e0b2369d55ec41b079a9/0001-bug-Patch-master-to-reproduce-sweetly-the-cache-warm.patch && git apply configs.patch
  This applies patches to master so the scenario works out neatly, in particular adding the --beat flag and scheduling a cache warm-up task on all dashboards every minute.
- $ docker-compose up -d
- Wait for the containers to be built and up.
- $ docker-compose logs superset-worker | grep cache-warmup
- $ docker-compose logs superset | grep slice
- $ docker-compose exec redis redis-cli, then type KEYS * (a Python equivalent is sketched after this list)
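If you prefer checking Redis from Python rather than redis-cli, a quick equivalent (assuming the redis-py package is installed and the Redis port from docker-compose.yml is reachable from where you run it; adjust host and db as needed):

import redis

r = redis.Redis(host="localhost", port=6379, db=0)
# On a broken warm-up, only the Celery binding key shows up:
print(r.keys("*"))  # e.g. [b'_kombu.binding.celery']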
Environment
(please complete the following information):
- superset version: 0.36.0
- python version: dockerized
- node.js version: dockerized
- npm version: dockerized
Checklist
- I have checked the superset logs for python stacktraces and included it here as text if there are any.
- I have reproduced the issue with at least the latest released version of superset.
- I have checked the issue tracker for the same issue and I haven’t found one similar.
Top GitHub Comments
Any news or workarounds for avoiding the 302 to the login endpoint?
I still run into this issue using the latest docker image (warm-up succeeds on the worker, Superset logs show the redirect to login, no caches refreshed). Not being able to warm up caches periodically feels like a vital missing feature.
@ajwhite @betodealmeida I sent a PR to address this issue, which is working in my environment.