Limit task concurrency per worker?
Description
Provide the ability to limit task concurrency per worker.
Use case / motivation
My use case is that I have a particularly heavy task - one that uses lots of RAM & GPU - where if too many instances of that task are running on the same machine at a time, it’ll crash. My ideal situation is to have a flag on the operator, something like `task_concurrency_per_worker`, that’ll guarantee at most N instances of a particular task are running on that worker at a time.
For example, if I trigger 4 instances of DAG A right now, even with 4 workers and `task_concurrency = 4` on the operator, I believe there’s no guarantee that each worker will receive at most 1 instance of the task, and hence it could end up with e.g. 2 instances on worker #1 and 2 instances on worker #2.
Another heavy-handed solution would be reducing `worker_concurrency`, but that would restrict worker concurrency for all tasks & DAGs, and so isn’t ideal as it’s overly restrictive.
Said another way, this feature request is basically to combine `task_concurrency` on the operator and `worker_concurrency` to make a task-specific worker concurrency.
I primarily work with the CeleryExecutor; I’m not familiar enough with the other non-local executors to know if this is a reasonable request for those executor types.
Thanks!

Top Related StackOverflow Question
My problem was a little different, but the same approach could work.
You’d need to plug into Celery and direct jobs to workers with capacity using per-worker queues. To make this work across schedulers and workers, I’ve so far used Redis to share bookkeeping information and to ensure consistency when multiple clients update that information.
I’d set each worker’s capacity to its concurrency level and default tasks to a cost of 1; your expensive tasks can then use larger values to reserve a certain ‘chunk’ of your worker resources.
You can re-route tasks in Celery by hooking into the `task_routes` configuration. If you set this to a function, that function is called for every routing decision. In Airflow, you can set this hook by supplying a custom `CELERY_CONFIG` dictionary, with the router function defined in a separate module (as Celery should not try to import it until after configuration is complete and the task router is actually needed).
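Roughly, the wiring could look like this (a sketch only: the module names `capacity_celery_config` and `capacity_router` are placeholders, and the import path for Airflow's stock `DEFAULT_CELERY_CONFIG` assumes a reasonably recent Airflow):

```python
# capacity_celery_config.py -- placeholder module on the Airflow PYTHONPATH
from airflow.config_templates.default_celery import DEFAULT_CELERY_CONFIG

# Start from Airflow's stock Celery settings and add a task router.
# The router is referenced by dotted path so Celery only imports that
# module once configuration is complete and a routing decision is needed.
CELERY_CONFIG = {
    **DEFAULT_CELERY_CONFIG,
    "task_routes": "capacity_router.route_for_capacity",
}
```

The router function itself (`route_for_capacity` here) is sketched further below, once the Redis bookkeeping has been described.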
Then, in `airflow.cfg`, point `celery_config_options` at that dictionary.
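Assuming the placeholder module above, that would be:

```ini
[celery]
celery_config_options = capacity_celery_config.CELERY_CONFIG
```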
You can then use Celery signal handlers to maintain worker capacity. Which hooks you need depends on how you get your task ‘size’ data, but assuming a hardcoded map you’d use the following (sketched after the list):
- `celeryd_after_setup` to generate a worker queue name to listen to.
- `worker_ready` to add the worker queue to Redis with total worker capacity.
- `worker_shutting_down` to remove the worker from Redis altogether.
- `task_postrun` to return task size back to the worker capacity level.
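A sketch of those handlers, with everything application-specific hardcoded (the module name, Redis URL, capacity number, and task-size map are placeholder values; deriving the real ‘size’ of a task is the part you’d have to adapt):

```python
# capacity_signals.py -- hypothetical module; import it from the worker side
# (e.g. from the same module that defines CELERY_CONFIG) so the handlers
# actually get registered.
import redis
from celery import signals

TASK_SIZES = {"my_dag.my_heavy_task": 4}   # placeholder size map
DEFAULT_TASK_SIZE = 1
WORKER_CAPACITY = 16                       # e.g. your worker_concurrency
CAPACITY_KEY = "worker-capacity"           # Redis sorted-set key

rds = redis.Redis.from_url("redis://localhost:6379/1")  # placeholder URL
worker_queue = None                        # set once per worker process


@signals.celeryd_after_setup.connect
def add_per_worker_queue(sender, instance, **kwargs):
    """Create a per-worker queue name and start consuming from it."""
    global worker_queue
    worker_queue = f"{sender}.capacity"
    instance.app.amqp.queues.select_add(worker_queue)


@signals.worker_ready.connect
def register_capacity(sender=None, **kwargs):
    """Advertise this worker's queue and its total capacity in Redis."""
    rds.zadd(CAPACITY_KEY, {worker_queue: WORKER_CAPACITY})


@signals.worker_shutting_down.connect
def unregister_worker(sender=None, **kwargs):
    """Remove this worker from the capacity set so it gets no more work."""
    rds.zrem(CAPACITY_KEY, worker_queue)


@signals.task_postrun.connect
def release_capacity(sender=None, **kwargs):
    """Give the finished task's reserved capacity back to this worker."""
    size = TASK_SIZES.get(sender.name, DEFAULT_TASK_SIZE)
    rds.zincrby(CAPACITY_KEY, size, worker_queue)
```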
The task router is responsible for reducing the available worker capacity; you want to do this as soon as you make a routing decision, so that no further tasks are sent to a worker that is already committed to a workload.

In Redis, store the capacity in a sorted set (`ZADD worker-capacity [worker-queue] [worker capacity]`) so you can quickly access the worker with the most capacity. Use Redis 5.0 or newer so you can use `ZPOPMAX` to get the least-loaded worker available. Unfortunately there is no way to both get the worker with the most capacity and decrement its capacity in one command, so use a pipeline with `WATCH`.
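A sketch of such a router, reusing the same placeholder size map and Redis key; because `ZPOPMAX` would remove the member entirely, this version peeks at the top of the sorted set and reserves capacity with `ZINCRBY` inside the `WATCH` transaction, retrying if another client got there first:

```python
# capacity_router.py -- hypothetical module referenced from CELERY_CONFIG
import redis
from redis.exceptions import WatchError

TASK_SIZES = {"my_dag.my_heavy_task": 4}   # placeholder size map
DEFAULT_TASK_SIZE = 1
CAPACITY_KEY = "worker-capacity"

rds = redis.Redis.from_url("redis://localhost:6379/1")  # placeholder URL


def route_for_capacity(name, args, kwargs, options, task=None, **kw):
    """Send the task to the per-worker queue with the most free capacity."""
    size = TASK_SIZES.get(name, DEFAULT_TASK_SIZE)
    with rds.pipeline() as pipe:
        while True:
            try:
                pipe.watch(CAPACITY_KEY)
                # Peek at the worker with the highest remaining capacity.
                best = pipe.zrevrange(CAPACITY_KEY, 0, 0, withscores=True)
                if not best or best[0][1] < size:
                    pipe.unwatch()
                    return None            # fall back to the default queue
                queue = best[0][0].decode()
                # Reserve the capacity; WatchError means another client
                # touched the sorted set in the meantime, so retry.
                pipe.multi()
                pipe.zincrby(CAPACITY_KEY, -size, queue)
                pipe.execute()
                return {"queue": queue}
            except WatchError:
                continue
```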
Finally, if you are already using Redis as your Celery broker, you can reuse the Celery connection pool for these tasks. This would differ if you are using RabbitMQ: you’d have to maintain your own connection pool or use a different method of sharing this information between the different components.

To reuse the connection pool, get access to the Celery app. How you do that depends on the specific signal handler, or on whether you are inside the router function. In the latter, the `task` object has an `app` attribute, for example, while the worker signal hooks will be passed a `worker` instance object, which again has an `app` attribute. Given the Celery app in `app`, you can then borrow a broker connection from its pool.
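For example (a hedged sketch: `reserve_capacity` is an illustrative helper, and reaching the underlying redis client via `connection.default_channel.client` relies on an implementation detail of kombu’s Redis transport rather than a public API):

```python
def reserve_capacity(app, queue, size):
    """Decrement a worker queue's capacity using a pooled broker connection."""
    # app.pool is the Celery broker connection pool; block until one is free.
    with app.pool.acquire(block=True) as connection:
        # Redis broker only: kombu's channel exposes the underlying redis
        # client, so the same connection can serve the capacity bookkeeping.
        client = connection.default_channel.client
        client.zincrby("worker-capacity", -size, queue)
```

In the router you’d reach the app through `task.app`, and in the worker signal handlers through the `worker` argument’s `app` attribute, as described above.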
You may have to increase the Celery `broker_pool_limit` configuration, however, depending on how busy your cluster gets.

One of these days I may actually write that blog post on this subject I was planning, but the above will have to do for now.
@mjpieters thanks for the great explanation!