Allow multiple pools for one task

See original GitHub issue

Hello!

Description of feature:

I think it would be helpful to allow multiple pools for one task. Currently, the pool argument of any class inheriting from BaseOperator is a string, so a task can belong to only one pool. I propose allowing pool to be a list of strings instead. A task would then wait until a slot is available in every pool it declares, and while running it would occupy a slot in each of those pools.

Use case:

I have some tasks that require multiple resources. I cannot split them into separate tasks that each require one resource, because they need two (or more) resources at the same time to do their work. I also have tasks that require only one of the resources, so I can’t create a single combined pool for both resources. Example:

  • Task 1 requires resources A and B.
  • Task 2 requires resource A.
  • Task 3 requires resource B.
  • Resource A can only have 4 connections.
  • Resource B can only have 16 connections.

Task 1 would need to be in both pool A and pool B, which is not possible today because I can only specify one pool.
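The requested behaviour for this example can be sketched in plain Python. This is only an illustration of the proposed all-or-nothing semantics, not Airflow code: the `Pool` class, `try_acquire`, and `release` are hypothetical names invented here, and Airflow's scheduler does not work this way today.

```python
from typing import List


class Pool:
    """Hypothetical stand-in for a pool: a name and a fixed number of slots."""

    def __init__(self, name: str, slots: int) -> None:
        self.name = name
        self.slots = slots
        self.used = 0

    def has_free_slot(self) -> bool:
        return self.used < self.slots


def try_acquire(pools: List[Pool]) -> bool:
    """Take one slot in every pool, or take nothing if any pool is full."""
    if all(p.has_free_slot() for p in pools):
        for p in pools:
            p.used += 1
        return True
    return False


def release(pools: List[Pool]) -> None:
    """Give back the slot held in each pool when the task finishes."""
    for p in pools:
        p.used -= 1


# Limits taken from the example: resource A allows 4 connections, B allows 16.
pool_a = Pool("A", slots=4)
pool_b = Pool("B", slots=16)

# Task 1 needs A and B, task 2 needs only A, task 3 needs only B.
task_pools = {
    "task_1": [pool_a, pool_b],
    "task_2": [pool_a],
    "task_3": [pool_b],
}

# Three instances of task 1 start, then one task 2 uses the last slot in A.
started = [try_acquire(task_pools["task_1"]) for _ in range(3)]
started.append(try_acquire(task_pools["task_2"]))

# A is now full, so another task 1 must wait and must not take a slot in B.
blocked = try_acquire(task_pools["task_1"])
```

In this sketch a blocked task leaves pool B untouched, so task 3 can still start while task 1 waits for a free slot in pool A.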

What would I want to happen?

Allow multiple pools in task creation. I looked into the Airflow source code, and the assumption that a task has exactly one pool appears to reach deep into the SQL. So I cannot simply fork Airflow and add this feature: the change is not small, and I do not understand Airflow well enough to make it myself.

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 17
  • Comments: 9 (5 by maintainers)

Top GitHub Comments

7 reactions
c-thiel commented, Jul 5, 2021

To motivate this a little bit further, the following use-case would also be solved with this PR:

When we use the KubernetesPodOperator, we launch pods in Namespaces. These Namespaces have limits, but Airflow is currently unaware of them. If we hit the limits, Airflow just continues to schedule tasks, which then fail immediately. We should therefore be able to put each task in two pools: one representing the memory limit and one representing the CPU limit. This would be an essential feature for larger Kubernetes deployments.
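A rough sketch of this use case follows, assuming each pod would claim a different number of slots in each pool. Airflow's existing `pool_slots` argument sets the slot count for a task's single pool; a per-pool slot count is an assumption made here for illustration, as are the pool names, the limits, and the `try_schedule` helper.

```python
from typing import Dict


def try_schedule(requests: Dict[str, int], free: Dict[str, int]) -> bool:
    """Reserve the requested slots in every pool, or reserve nothing."""
    if all(free[pool] >= need for pool, need in requests.items()):
        for pool, need in requests.items():
            free[pool] -= need
        return True
    return False


# Hypothetical namespace limits expressed as pool slots.
free_slots = {"namespace_cpu": 8, "namespace_memory_gib": 16}

# Hypothetical pod that needs 2 CPU slots and 6 memory slots.
pod_request = {"namespace_cpu": 2, "namespace_memory_gib": 6}

# Try to launch three identical pods; memory runs out before CPU does.
launched = [try_schedule(pod_request, free_slots) for _ in range(3)]
```

Under these assumed numbers the third pod is held back by the memory pool even though CPU slots remain, which is the situation a single pool cannot express.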

1 reaction
potiuk commented, Dec 7, 2022

I would consider drafting an AIP but I don’t have the technical knowledge of Airflow’s architecture to propose or PoC an implementation. How much detail is expected from an AIP?

Rather detailed. Look at the other (completed) AIPs: they are a much better explanation of the expected level of detail than I could give here.

Just to set expectations: this is how things work in open source. Things get implemented when someone implements them. If you want something implemented, you either do it yourself or find someone who takes an interest and implements it. This project is developed by the community and run under the Apache Software Foundation rules, where anyone can contribute.
