Scheduler Dying/Hanging v2.0.0: WARNING - Killing DAGFileProcessorProcess
Apache Airflow version: 2.0.0
Environment: apache/airflow:2.0.0 docker image, Docker Desktop for Mac 3.0.4
What happened: The scheduler runs fine for a bit, then after a few minutes it starts spitting the following out every second (and the container appears to be stuck, as it needs to be force-killed):
scheduler_1 | [2021-01-12 01:45:48,149] {scheduler_job.py:262} WARNING - Killing DAGFileProcessorProcess (PID=1112)
scheduler_1 | [2021-01-12 01:45:49,153] {scheduler_job.py:262} WARNING - Killing DAGFileProcessorProcess (PID=1112)
scheduler_1 | [2021-01-12 01:45:49,159] {scheduler_job.py:262} WARNING - Killing DAGFileProcessorProcess (PID=1112)
scheduler_1 | [2021-01-12 01:45:50,163] {scheduler_job.py:262} WARNING - Killing DAGFileProcessorProcess (PID=1112)
scheduler_1 | [2021-01-12 01:45:50,165] {scheduler_job.py:262} WARNING - Killing DAGFileProcessorProcess (PID=1112)
scheduler_1 | [2021-01-12 01:45:51,169] {scheduler_job.py:262} WARNING - Killing DAGFileProcessorProcess (PID=1112)
scheduler_1 | [2021-01-12 01:45:51,172] {scheduler_job.py:262} WARNING - Killing DAGFileProcessorProcess (PID=1112)
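For context, here is a simplified sketch (not the actual Airflow source) of the pattern behind this message: on each loop, the DAG-file processor manager kills any parsing subprocess that has exceeded its timeout, so if the process never actually exits, the same warning repeats roughly once per second. The function and structure below are illustrative assumptions.

```python
# Illustrative sketch only -- not Airflow's implementation.
import logging
import time

log = logging.getLogger(__name__)


def reap_timed_out_processors(processors: dict, timeout: float) -> None:
    """Kill any parsing subprocess that has run longer than `timeout`.

    `processors` maps a DAG file path to (multiprocessing.Process, start_time).
    """
    now = time.monotonic()
    for path, (proc, started_at) in list(processors.items()):
        if proc.is_alive() and now - started_at > timeout:
            # If the process lingers (e.g. stuck in uninterruptible I/O),
            # this same warning is emitted again on the next loop.
            log.warning("Killing DAGFileProcessorProcess (PID=%s)", proc.pid)
            proc.kill()
```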
Note that there is only a single dag enabled with a single task as I’m just trying to get this off the ground. That dag is scheduled to run daily so it’s almost never running aside from when I’m manually testing it.
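For reference, a minimal sketch of the kind of single-task daily DAG described above; the dag_id, task, and callable are hypothetical stand-ins, not taken from the original report.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def _do_work():
    print("doing the one daily task")


with DAG(
    dag_id="example_daily_dag",       # hypothetical name
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",       # matches the daily schedule described
    catchup=False,
) as dag:
    PythonOperator(task_id="do_work", python_callable=_do_work)
```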
What you expected to happen: The scheduler to stay idle without issues.
How to reproduce it: Unclear, as it seems to happen almost randomly after a few minutes. The scheduler is using LocalExecutor, and I have the scheduler and webserver running in separate containers, which may or may not be related. Let me know what other information might be helpful. Below is the scheduler section of my airflow.cfg:
[scheduler]
# Task instances listen for external kill signal (when you clear tasks
# from the CLI or the UI), this defines the frequency at which they should
# listen (in seconds).
job_heartbeat_sec = 10
# How often (in seconds) to check and tidy up 'running' TaskInstances
# that no longer have a matching DagRun
clean_tis_without_dagrun_interval = 15.0
# The scheduler constantly tries to trigger new tasks (look at the
# scheduler section in the docs for more information). This defines
# how often the scheduler should run (in seconds).
scheduler_heartbeat_sec = 10
# The number of times to try to schedule each DAG file
# -1 indicates unlimited number
num_runs = -1
# The number of seconds to wait between consecutive DAG file processing
processor_poll_interval = 10
# After how much time (seconds) new DAGs should be picked up from the filesystem
min_file_process_interval = 30
# How often (in seconds) to scan the DAGs directory for new files. Default to 5 minutes.
dag_dir_list_interval = 300
# How often should stats be printed to the logs. Setting to 0 will disable printing stats
print_stats_interval = 30
# How often (in seconds) should pool usage stats be sent to statsd (if statsd_on is enabled)
pool_metrics_interval = 5.0
# If the last scheduler heartbeat happened more than scheduler_health_check_threshold
# ago (in seconds), scheduler is considered unhealthy.
# This is used by the health check in the "/health" endpoint
scheduler_health_check_threshold = 30
# How often (in seconds) should the scheduler check for orphaned tasks and SchedulerJobs
orphaned_tasks_check_interval = 300.0
child_process_log_directory = /opt/airflow/logs/scheduler
# Local task jobs periodically heartbeat to the DB. If the job has
# not heartbeat in this many seconds, the scheduler will mark the
# associated task instance as failed and will re-schedule the task.
scheduler_zombie_task_threshold = 300
# Turn off scheduler catchup by setting this to ``False``.
# Default behavior is unchanged and
# Command Line Backfills still work, but the scheduler
# will not do scheduler catchup if this is ``False``,
# however it can be set on a per DAG basis in the
# DAG definition (catchup)
catchup_by_default = False
# This changes the batch size of queries in the scheduling main loop.
# If this is too high, SQL query performance may be impacted by one
# or more of the following:
# - reversion to full table scan
# - complexity of query predicate
# - excessive locking
# Additionally, you may hit the maximum allowable query length for your db.
# Set this to 0 for no limit (not advised)
max_tis_per_query = 512
# Should the scheduler issue ``SELECT ... FOR UPDATE`` in relevant queries.
# If this is set to False then you should not run more than a single
# scheduler at once
use_row_level_locking = True
# Max number of DAGs to create DagRuns for per scheduler loop
#
# Default: 10
# max_dagruns_to_create_per_loop =
# How many DagRuns should a scheduler examine (and lock) when scheduling
# and queuing tasks.
#
# Default: 20
# max_dagruns_per_loop_to_schedule =
# Should the Task supervisor process perform a "mini scheduler" to attempt to schedule more tasks of the
# same DAG. Leaving this on will mean tasks in the same DAG execute quicker, but might starve out other
# dags in some circumstances
#
# Default: True
# schedule_after_task_execution =
# The scheduler can run multiple processes in parallel to parse dags.
# This defines how many processes will run.
parsing_processes = 1
# Turn off scheduler use of cron intervals by setting this to False.
# DAGs submitted manually in the web UI or with trigger_dag will still run.
use_job_schedule = True
# Allow externally triggered DagRuns for Execution Dates in the future
# Only has effect if schedule_interval is set to None in DAG
allow_trigger_in_future = False
max_threads = 1
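Since the config above ties scheduler_health_check_threshold to the webserver's /health endpoint, one way to watch for the hang from outside the container is to poll that endpoint. A minimal sketch, assuming the webserver is reachable at localhost:8080:

```python
import time

import requests  # third-party; pip install requests

BASE_URL = "http://localhost:8080"  # assumed local webserver address

while True:
    health = requests.get(f"{BASE_URL}/health", timeout=5).json()
    scheduler = health.get("scheduler", {})
    print(scheduler.get("status"), scheduler.get("latest_scheduler_heartbeat"))
    if scheduler.get("status") != "healthy":
        print("Scheduler heartbeat is stale; the container may be hung.")
    time.sleep(30)
```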
Top GitHub Comments
I’m also facing the exact same issue: Airflow 2.1.4
Number of Schedulers: 1
Any luck on resolving this?
Not yet @DuyHV20150601, I still get the same error. Maybe this https://github.com/apache/airflow/issues/17507#issuecomment-973177410 can solve the issue, but I haven't tested it yet.