Add virtual env creation in Celery workers
Description
Add an option for the Celery worker to create a new virtual env, install some packages, and run the airflow run command inside it (based on executor_config params).
Really nice to have: a reusable virtual env that can be shared between tasks with the same params (based on user configuration).
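To make the proposed interface concrete, here is a hypothetical sketch of what a task definition could look like. The virtualenv key and everything inside it are invented for illustration and do not exist in Airflow today; executor_config itself is a real BaseOperator parameter. Tasks passing an identical virtualenv block could then transparently share one env.

```python
# Hypothetical sketch only: the "virtualenv" executor_config key shown here
# does not exist in Airflow; it illustrates the proposed interface.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def transform():
    import pandas  # would resolve inside the worker-created virtual env
    print(pandas.__version__)


with DAG(
    dag_id="venv_per_task_demo",
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,
) as dag:
    transform_task = PythonOperator(
        task_id="transform",
        python_callable=transform,
        executor_config={
            # Proposed (not existing) params: the Celery worker would build
            # (or reuse) a venv with these packages and run the task in it.
            "virtualenv": {
                "requirements": ["pandas==1.3.5"],
                "python_version": "3.9",
                "reusable": True,  # share the env between tasks with the same params
            }
        },
    )
```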
Use case / motivation
Once you reach the point where you want to run one cluster for different types of Python tasks, with multiple teams working on that same cluster, you need to start splitting the business logic code into separate Python packages. This allows better version control and avoids having to restart the workers whenever new util code is deployed. I think it would be amazing if we could allow creating new virtual envs as part of Airflow and control the package versions.
I know that PythonVirtualenvOperator exists, but:
- Creating envs feels like an executor's job to me; the coder should not have to reach for a specific operator to get one.
- The big downside is that if I want to use ShortCircuitOperator, BranchPythonOperator, or any new Python-based operator, I have to create a new operator that inherits from PythonVirtualenvOperator and duplicates the desired functionality (see the sketch after this list).
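For context, this is roughly how the operator-level approach looks today using the real PythonVirtualenvOperator API (the my_team_utils package name is made up for illustration):

```python
# Current state: venv isolation is tied to one specific operator.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonVirtualenvOperator


def callable_in_venv():
    # Imports resolve inside the freshly built virtual env.
    import my_team_utils  # hypothetical package, for illustration
    my_team_utils.run()


with DAG(
    dag_id="venv_operator_demo",
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,
) as dag:
    isolated = PythonVirtualenvOperator(
        task_id="isolated",
        python_callable=callable_in_venv,
        requirements=["my-team-utils==2.0.0"],  # hypothetical package
        system_site_packages=False,
    )

# To get the same isolation for BranchPythonOperator or ShortCircuitOperator,
# one has to write a custom subclass of PythonVirtualenvOperator that
# re-implements the branching / short-circuit behaviour.
```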
Are you willing to submit a PR?
Yes, would love to.
Related Issues
Not that I can find.
Comments
@potiuk Didn’t get the response email, weird, but anyway I sent my Wiki ID now.
As I said, the issue here is that the user has to create some kind of complex deployment scripts, because running Airflow inside a venv requires a worker restart whenever packages are updated. Anyway, I had a few extra ideas to make this feature more reliable; I sent a request for edit access to the wiki and will write up a more detailed proposal once I have it. One of those ideas is sketched below.
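A minimal sketch of the reuse idea, assuming a simple on-disk cache keyed by the requirement set; none of these names, paths, or helpers exist in Airflow, they are purely for discussion:

```python
# Hypothetical worker-side helper, sketched for discussion. Envs are cached
# under a hash of their sorted requirements, so tasks with the same params
# reuse one venv instead of forcing a rebuild (or a worker restart) on
# every deploy. Interpreter version selection is omitted for brevity.
import hashlib
import subprocess
import venv
from pathlib import Path

CACHE_DIR = Path("/tmp/airflow-venv-cache")  # illustrative location


def get_or_create_venv(requirements):
    # Identical requirement sets hash to the same cache entry.
    key = hashlib.sha256("\n".join(sorted(requirements)).encode()).hexdigest()[:16]
    env_dir = CACHE_DIR / key
    if not env_dir.exists():
        venv.create(env_dir, with_pip=True)
        subprocess.check_call(
            [str(env_dir / "bin" / "pip"), "install", *requirements]
        )
    return env_dir


# The worker would then run the task via the cached interpreter, e.g.
# <env_dir>/bin/python -m airflow run <dag_id> <task_id> <execution_date>
```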