Celery worker support for Druid Queries

I set up a Superset cluster consisting of the Superset web server, Redis (as broker, result backend and cache) and Celery workers. Everything works fine within SQL Lab: the Celery workers receive the tasks, write to the result backend and fill the cache.

On the other hand, Druid queries via “Sources” --> “Datasources” are handled by the Superset web server and not by the Celery workers. Is this normal behavior?

Did I miss a configuration option that forces this kind of query to use the Celery workers as well?

Expected results

Druid queries via “Sources” --> “Datasources” are handled by the celery workers

Actual results

Druid queries via “Sources” --> “Datasources” are handled by the superset server
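
To make the difference between the expected and the actual behavior concrete, here is a toy model of the two execution paths. It is only a sketch and not Superset's code: a `ThreadPoolExecutor` stands in for the Celery worker pool, and `run_query`, `handle_async` and `handle_inline` are hypothetical names made up for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the Celery worker pool (assumption: the real setup uses a
# Redis broker and separate worker processes instead of threads).
worker_pool = ThreadPoolExecutor(max_workers=2, thread_name_prefix="worker")


def run_query(query):
    """Pretend to run a query and report which thread executed it."""
    return {"query": query, "handled_by": threading.current_thread().name}


def handle_async(query):
    """SQL Lab-style path: hand the query to the worker pool."""
    return worker_pool.submit(run_query, query).result()


def handle_inline(query):
    """Datasource-style path as reported: run inside the request handler."""
    return run_query(query)


async_result = handle_async("SELECT 1")
inline_result = handle_inline("druid datasource query")

print(async_result["handled_by"])   # a thread from the "worker" pool
print(inline_result["handled_by"])  # the thread that called handle_inline
```

In this model the expected result corresponds to every query going through `handle_async`, while the actual result is that datasource queries take the `handle_inline` path.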

Environment

(please complete the following information):

  • superset version: 0.28.1
  • python version: 3.6.8
  • docker image: amancevice/superset:0.28.1

superset_config.py:


import os

from werkzeug.contrib.cache import RedisCache

if 'SUPERSET_HOME' in os.environ:
    DATA_DIR = os.environ['SUPERSET_HOME']
else:
    DATA_DIR = os.path.join(os.path.expanduser('~'), '.superset')

LOG_FORMAT = "%(asctime)s:%(levelname)s:%(name)s:%(message)s"
LOG_LEVEL = "DEBUG"

ENABLE_TIME_ROTATE = False
TIME_ROTATE_LOG_LEVEL = "DEBUG"
FILENAME = os.path.join(DATA_DIR, "superset.log")
ROLLOVER = "midnight"
INTERVAL = 1
BACKUP_COUNT = 30

ROW_LIMIT = 50000
VIZ_ROW_LIMIT = 5000
FILTER_SELECT_ROW_LIMIT = 1000
SQLALCHEMY_TRACK_MODIFICATIONS = True
SUPERSET_WEBSERVER_TIMEOUT = 60

QUERY_SEARCH_LIMIT = 1000

POSTGRES_SERVER_URL = os.getenv('POSTGRES_SERVER_URL', '')
POSTGRES_DB = os.getenv('POSTGRES_DB', '')
POSTGRES_USER = os.getenv('POSTGRES_USER', '')
POSTGRES_PASSWORD = os.getenv('POSTGRES_PASSWORD', '')

SUPERSET_SQLALCHEMY_DATABASE_URI = "".join([
    'postgresql+psycopg2://',
    POSTGRES_USER, ':', POSTGRES_PASSWORD,
    '@', POSTGRES_SERVER_URL, '/', POSTGRES_DB,
])

SQLALCHEMY_DATABASE_URI = SUPERSET_SQLALCHEMY_DATABASE_URI

CACHE_CONFIG = {
    'CACHE_TYPE': 'redis',
    'CACHE_DEFAULT_TIMEOUT': os.getenv('CACHE_DEFAULT_TIMEOUT', ''),
    'CACHE_KEY_PREFIX': 'superset_cache',
    'CACHE_REDIS_URL': os.getenv('CACHE_REDIS_URL', ''),
}

MAPBOX_API_KEY = os.environ.get('MAPBOX_API_KEY', '')


class CeleryConfig(object):
    BROKER_URL = os.getenv('BROKER_URL', '')
    CELERY_IMPORTS = ('superset.sql_lab')
    CELERY_RESULT_BACKEND = os.getenv('CELERY_RESULT_BACKEND', '')
    CELERY_ANNOTATIONS = {'tasks.add': {'rate_limit': '10/s'}}
    CELERYD_TASK_SOFT_TIME_LIMIT = os.getenv('CELERYD_TASK_SOFT_TIME_LIMIT', '')
    CELERYD_TASK_TIME_LIMIT = os.getenv('CELERYD_TASK_TIME_LIMIT', '')  # 30 min
    CELERYD_MAX_TASKS_PER_CHILD = os.getenv('CELERYD_MAX_TASKS_PER_CHILD', '')
    CELERYD_LOG_LEVEL = os.getenv('CELERYD_LOG_LEVEL', '')
    CELERYD_PREFETCH_MULTIPLIER = os.getenv('CELERYD_PREFETCH_MULTIPLIER', '')
    CELERY_ACKS_LATE = True
    CELERY_SEND_EVENTS = True


CELERY_CONFIG = CeleryConfig

RESULTS_BACKEND = RedisCache(
    host=os.getenv('RESULTS_BACKEND_HOST', ''),
    port=os.getenv('RESULTS_BACKEND_PORT', ''),
    key_prefix='superset_results'
)
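
A side note on the config above, separate from the Druid routing question: `os.getenv` always returns strings, and `('superset.sql_lab')` is a plain string rather than a one-element tuple. The sketch below shows one way to make both explicit. The helper `env_int`, the class name `CeleryConfigSketch` and the fallback values are assumptions for illustration only; whether Celery tolerates the string forms is not something this sketch verifies.

```python
import os


def env_int(name, default):
    """Read an integer setting from the environment, or fall back to `default`."""
    raw = os.getenv(name, "")
    return int(raw) if raw.strip() else default


# Parentheses alone do not create a tuple; the trailing comma does.
as_string = ('superset.sql_lab')
as_tuple = ('superset.sql_lab',)


class CeleryConfigSketch(object):
    # Hypothetical variant of the CeleryConfig class above.
    CELERY_IMPORTS = as_tuple
    # 30 min, taken from the comment in the original config.
    CELERYD_TASK_TIME_LIMIT = env_int('CELERYD_TASK_TIME_LIMIT', 30 * 60)
    # Fallback of 1 is a placeholder, not a recommendation.
    CELERYD_PREFETCH_MULTIPLIER = env_int('CELERYD_PREFETCH_MULTIPLIER', 1)


print(type(as_string).__name__, type(as_tuple).__name__)
```

With this shape, an unset or empty environment variable yields a numeric default instead of an empty string.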


Command to start

  • Server: gunicorn
    -w 10
    -k gevent
    --timeout 60
    -b 0.0.0.0:8088
    --limit-request-line 0
    --limit-request-field_size 0
    superset:app
  • Worker: celery worker --app=superset.sql_lab:celery_app --pool=gevent -Ofair --task-events

Checklist

Make sure these boxes are checked before submitting your issue - thank you!

  • I have checked the superset logs for python stacktraces and included it here as text if there are any.
  • I have reproduced the issue with at least the latest released version of superset.
  • I have checked the issue tracker for the same issue and I haven’t found one similar.

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

1 reaction
issue-label-bot[bot] commented, Sep 4, 2019

Issue-Label Bot is automatically applying the label #enhancement to this issue, with a confidence of 0.82. Please mark this comment with 👍 or 👎 to give our bot feedback!

Links: app homepage, dashboard and code for this bot.

0 reactions
Stephan3555 commented, Dec 20, 2019

Hi @willbarrett! Yes, and it's working very well. Unfortunately, you have to write a SQL query to get it working, right? Around 90% of our Superset users are not able to use SQL and are not willing to learn it.
