[SIP-71] API to pass metadata over to Celery workers

[SIP-71] Proposal for API to pass metadata over to Celery workers

Motivation

Running analytical queries requires several types of metadata, including user-impersonating authentication tokens and granular source information (e.g. dashboard or chart IDs). Unfortunately, this “global” information is not available on the Celery workers, since they run remotely in a different Flask context. This proposal adds a Superset config API that passes the required “global” data over, so that user code on a Celery worker can use it.

Proposed Change

Add a CELERY_FLASK_METADATA_EXTRACTOR config function that saves all necessary data from flask.g, flask.request and flask.context on the request side, and a CELERY_FLASK_METADATA_INITIALIZER config function that restores that data on the Celery worker side.
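The proposal does not spell out the signatures of the two functions, so the following is only a minimal sketch of how such a pair might look. The function names, the explicit `g` and `request` parameters, and the `user_token` and `dashboard_id` fields are all assumptions made for illustration. To keep the sketch runnable without Flask or Celery, `types.SimpleNamespace` objects stand in for `flask.g` and `flask.request`, and a JSON round trip stands in for the task message sent to the worker.

```python
import json
from types import SimpleNamespace
from typing import Any, Dict


def extract_metadata(g: Any, request: Any) -> Dict[str, Any]:
    """Request side: collect JSON-serializable values to ship with the task.

    Hypothetical signature; the proposal does not define one.
    """
    return {
        "user_token": getattr(g, "user_token", None),
        "dashboard_id": request.args.get("dashboard_id"),
    }


def initialize_metadata(g: Any, metadata: Dict[str, Any]) -> None:
    """Worker side: restore the shipped values onto the worker's own `g`."""
    for key, value in metadata.items():
        setattr(g, key, value)


# Stand-ins for flask.g and flask.request on the web request side.
request_g = SimpleNamespace(user_token="token-abc")
request = SimpleNamespace(args={"dashboard_id": "42"})

# The JSON round trip mimics the metadata travelling in a task message.
payload = json.dumps(extract_metadata(request_g, request))

# Stand-in for the fresh flask.g that exists on the Celery worker.
worker_g = SimpleNamespace()
initialize_metadata(worker_g, json.loads(payload))
```

Returning a plain dictionary keeps the payload serializable, which matters because the worker runs in a different process and cannot share the request's objects directly.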

New or Changed Public Interfaces

The Superset config gains two new config functions: CELERY_FLASK_METADATA_EXTRACTOR and CELERY_FLASK_METADATA_INITIALIZER.

New dependencies

None

Migration Plan and Compatibility

The API is completely backward compatible.
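One way backward compatibility could hold is for both config functions to default to no-ops, so deployments that never set them behave exactly as before. The sketch below assumes this; the helper `load_hooks`, the default function names, and the signatures are hypothetical and not taken from Superset.

```python
from typing import Any, Callable, Dict, Tuple


def default_extractor(g: Any, request: Any) -> Dict[str, Any]:
    """No-op default: nothing is extracted on the request side."""
    return {}


def default_initializer(g: Any, metadata: Dict[str, Any]) -> None:
    """No-op default: nothing is restored on the worker side."""
    return None


def load_hooks(config: Dict[str, Any]) -> Tuple[Callable, Callable]:
    """Look up the two hooks, falling back to the no-op defaults."""
    extractor = config.get("CELERY_FLASK_METADATA_EXTRACTOR", default_extractor)
    initializer = config.get("CELERY_FLASK_METADATA_INITIALIZER", default_initializer)
    return extractor, initializer


# An existing config that defines neither option keeps its old behavior.
extractor, initializer = load_hooks({})
```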

Rejected Alternatives

  • I considered a CELERY_PASSOVER_METADATA config tuple naming the fields of flask.g to save and restore, but it would not allow populating flask.g with custom fields (maybe derived from flask.request or flask.session). Adding new functionality to populate flask.g in the first place would become as complex as this proposal while being far less intuitive.
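The limitation of the rejected tuple-based approach can be sketched as follows. This is an illustration only: `save_fields`, `restore_fields` and the field names are hypothetical, and `types.SimpleNamespace` stands in for `flask.g`. A name-based copy can only carry over what is already stored on `g`, so a value that lives elsewhere (here, a dashboard ID that was never placed on `g`) is lost.

```python
from types import SimpleNamespace
from typing import Any, Dict, Tuple

# Hypothetical value for the rejected config option.
CELERY_PASSOVER_METADATA: Tuple[str, ...] = ("user_token", "dashboard_id")


def save_fields(g: Any, names: Tuple[str, ...]) -> Dict[str, Any]:
    """Copy only the listed attributes that actually exist on `g`."""
    return {name: getattr(g, name) for name in names if hasattr(g, name)}


def restore_fields(g: Any, saved: Dict[str, Any]) -> None:
    """Set the saved attributes on the worker-side `g`."""
    for name, value in saved.items():
        setattr(g, name, value)


# `dashboard_id` was never set on g, so there is nothing to copy for it.
request_g = SimpleNamespace(user_token="token-abc")
saved = save_fields(request_g, CELERY_PASSOVER_METADATA)

worker_g = SimpleNamespace()
restore_fields(worker_g, saved)
```

With an extractor function instead of a tuple, the same value could be computed from the request at extraction time without first being written to `g`.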

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Reactions: 4
  • Comments: 7 (6 by maintainers)

Top GitHub Comments

craig-rueda commented, Aug 20, 2021 (2 reactions)

LGTM as long as the new config options default to a no-op, which I’m sure you were planning anyways.

cccs-RyanS commented, Dec 16, 2022 (0 reactions)

Hey @rusackas would love to revive this one and the associated PR. Should I create a new issue for it / have this one reopened?
