Ability to better support odd scheduling time
Description
I have recently been getting requests for odd schedules, for example an update that must run at 4pm on the second and third Friday of every month.
I searched for a solution but could not find a perfect one. The workarounds I can think of are:
- use a cron schedule and a separate DAG
- embed the trigger in the actual task logic, that is, check whether the code should run on that date and otherwise return success (equivalent to skipping and marking success)
Drawbacks (at least the ones I can think of):
- Separate DAG: it creates an extra DAG, and with many such requests that gets messy. You also need an equal number of ExternalTaskSensor tasks to preserve inter-DAG dependencies, and the extra DAG has to provision its own dependent resources (such as Kafka consumers).
- In-task check: it is not very generic or reusable, and in a fully containerized setup, where each task runs in its own container, you have to start a container just to reach that simple if-else check, which seems wasteful.
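For reference, the in-task check from the second workaround could look roughly like the sketch below. It uses only the standard library; the function names `friday_ordinal` and `should_run` are made up for illustration and are not part of Airflow.

```python
from datetime import date


def friday_ordinal(day: date) -> int:
    """Return n if `day` is the n-th Friday of its month, otherwise 0."""
    if day.weekday() != 4:  # Monday is 0, so Friday is 4
        return 0
    # Days 1-7 hold the first Friday, 8-14 the second, and so on.
    return (day.day - 1) // 7 + 1


def should_run(day: date, allowed_ordinals=(2, 3)) -> bool:
    """True when `day` is one of the allowed Fridays (second and third by default)."""
    return friday_ordinal(day) in allowed_ordinals
```

A task would call `should_run` with its run date and return early when the result is False, which is exactly the if-else that still costs a container start in a containerized setup.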
Use case / motivation
Basically, I have a bunch of extraction scripts that need to run at very different and very unusual times. I would like scheduling them to be straightforward, without replicating provisioning tasks or adding too many inter-DAG dependencies.
Potential solutions I have in mind:
- Not touching Airflow core (workaround): inherit from and extend the operators I am using. This is troublesome and has to be repeated for every operator.
- Touching Airflow core: pass an extra tuple parameter, allowed_dates, into the operators (BaseOperator) and inject the date check right before operator execution. This essentially gives every operator the ability to be skipped programmatically in any DAG, which might be overkill.
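To make the second idea concrete, here is a rough sketch of the behaviour I have in mind. Nothing in it is real Airflow code: `SketchOperator`, `SkipTask`, `ExtractOperator`, the `run` wrapper and the `run_date` context key are all hypothetical stand-ins, and `allowed_dates` is taken to be a predicate on the date rather than a literal tuple, since that is easier to write for rules like "second and third Friday".

```python
from datetime import date
from typing import Callable, Optional


class SkipTask(Exception):
    """Hypothetical skip signal (Airflow has AirflowSkipException for this purpose)."""


class SketchOperator:
    """Hypothetical minimal operator base class; NOT Airflow's BaseOperator."""

    def __init__(self, task_id: str,
                 allowed_dates: Optional[Callable[[date], bool]] = None):
        self.task_id = task_id
        self.allowed_dates = allowed_dates  # assumed to be a predicate on the run date

    def execute(self, context: dict):
        raise NotImplementedError

    def run(self, context: dict):
        """Apply the date check right before handing over to execute()."""
        run_date = context["run_date"]  # hypothetical context key
        if self.allowed_dates is not None and not self.allowed_dates(run_date):
            raise SkipTask(f"{self.task_id}: {run_date} is not an allowed date")
        return self.execute(context)


class ExtractOperator(SketchOperator):
    """Example subclass standing in for any real operator."""

    def execute(self, context: dict):
        return "extracted"
```

Because the check lives in the base class, every subclass gets it without being extended individually, which is the difference from the workaround in the first bullet.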
Issue Analytics
- State:
- Created 3 years ago
- Reactions:1
- Comments:11 (3 by maintainers)
Please check Timetables: https://airflow.apache.org/docs/apache-airflow/stable/concepts/timetable.html If a specific feature is missing, you can open a new issue and explain it.
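As a rough illustration of what a custom timetable for the schedule in this issue would have to compute, the sketch below finds the next run time after a given moment. It is plain Python, not the Timetable interface itself: `RUN_AT`, `is_second_or_third_friday` and `next_run_after` are made-up names, datetimes are naive (no timezone handling), and wiring the result into a Timetable subclass is left out.

```python
from datetime import date, datetime, time, timedelta

RUN_AT = time(16, 0)  # 4pm, as in the example from the issue


def is_second_or_third_friday(day: date) -> bool:
    # The second Friday falls on days 8-14 and the third on days 15-21.
    return day.weekday() == 4 and 8 <= day.day <= 21


def next_run_after(after: datetime) -> datetime:
    """Return the first scheduled run strictly later than `after`."""
    day = after.date()
    while True:
        candidate = datetime.combine(day, RUN_AT)
        if is_second_or_third_friday(day) and candidate > after:
            return candidate
        day += timedelta(days=1)
```

Scanning day by day is simple rather than clever, but the gap between two qualifying Fridays is at most a few weeks, so the loop stays short.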
I’ve started a discussion thread on the dev mailing list to scope out what a solution to this would look like: https://lists.apache.org/thread.html/rb4e004e68574e5fb77ee5b51f4fd5bfb4b3392d884c178bc767681bf%40<dev.airflow.apache.org>
Use cases posted there would be ace (as would feedback once we come up with a design).