Airflow CLI tasks clear command error (due to "latest" symlink?)
Apache Airflow version
2.3.2 (latest released)
What happened
When running the airflow tasks clear command, we get the following error:
[2022-06-07 15:59:58,353] {dagbag.py:507} INFO - Filling up the DagBag from /usr/local/airflow
Traceback (most recent call last):
File "/usr/local/bin/airflow", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.8/dist-packages/airflow/__main__.py", line 38, in main
args.func(args)
File "/usr/local/lib/python3.8/dist-packages/airflow/cli/cli_parser.py", line 51, in command
return func(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/airflow/utils/cli.py", line 99, in wrapper
return f(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/airflow/cli/commands/task_command.py", line 591, in task_clear
dags = get_dags(args.subdir, args.dag_id, use_regex=args.dag_regex)
File "/usr/local/lib/python3.8/dist-packages/airflow/utils/cli.py", line 214, in get_dags
return [get_dag(subdir, dag_id)]
File "/usr/local/lib/python3.8/dist-packages/airflow/utils/cli.py", line 201, in get_dag
dagbag = DagBag(process_subdir(subdir))
File "/usr/local/lib/python3.8/dist-packages/airflow/models/dagbag.py", line 130, in __init__
self.collect_dags(
File "/usr/local/lib/python3.8/dist-packages/airflow/models/dagbag.py", line 514, in collect_dags
for filepath in list_py_file_paths(
File "/usr/local/lib/python3.8/dist-packages/airflow/utils/file.py", line 305, in list_py_file_paths
file_paths.extend(find_dag_file_paths(directory, safe_mode))
File "/usr/local/lib/python3.8/dist-packages/airflow/utils/file.py", line 323, in find_dag_file_paths
for file_path in find_path_from_directory(str(directory), ".airflowignore"):
File "/usr/local/lib/python3.8/dist-packages/airflow/utils/file.py", line 242, in _find_path_from_directory
raise RuntimeError(
RuntimeError: Detected recursive loop when walking DAG directory /usr/local/airflow: /usr/local/airflow/logs/splunk/scheduler/2022-06-07 has appeared more than once.
Looking at this directory, I see 2022-06-07 and latest, which is a symlink to 2022-06-07.
The error is raised from here: https://github.com/apache/airflow/blob/0bf5f495d4131109fba449697adee68a62516851/airflow/utils/file.py#L242
We set child_process_log_directory = /usr/local/airflow/logs/splunk/scheduler in our airflow.cfg.
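The failure mode can be illustrated with a short, self-contained sketch. This is not Airflow's actual implementation in airflow/utils/file.py, only a mimic of its resolved-path loop check: when the directory walk follows the latest symlink, the same resolved directory is seen twice and the check fires.

```python
# Illustration of the loop detection (NOT Airflow's actual code):
# a "latest" symlink pointing at a sibling directory makes the same
# resolved path appear twice during the walk.
import os
import tempfile

root = tempfile.mkdtemp()
dated = os.path.join(root, "logs", "splunk", "scheduler", "2022-06-07")
os.makedirs(dated)
# "latest" is a symlink to the dated directory, as in this report.
os.symlink(dated, os.path.join(root, "logs", "splunk", "scheduler", "latest"))

error = None
seen = set()
try:
    for dirpath, _dirnames, _filenames in os.walk(root, followlinks=True):
        real = os.path.realpath(dirpath)
        if real in seen:
            raise RuntimeError(
                f"Detected recursive loop when walking DAG directory {root}: "
                f"{real} has appeared more than once."
            )
        seen.add(real)
except RuntimeError as exc:
    error = exc

print(error)
```

Both 2022-06-07 and latest resolve to the same real path, so whichever is walked second trips the check, matching the traceback above.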
What you think should happen instead
The clear command should run successfully.
How to reproduce
My understanding is that if you have 2022-06-07 and latest within your scheduler logging directory, and you try to clear a task, the CLI command would fail. We are overriding child_process_log_directory = /usr/local/airflow/logs/splunk/scheduler in the airflow.cfg.
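For reference, the override looks like this in airflow.cfg (the option lives in the [scheduler] section; the path is the one from this report):

```
[scheduler]
child_process_log_directory = /usr/local/airflow/logs/splunk/scheduler
```

Because this path sits underneath the DAGs folder (/usr/local/airflow), the DagBag walk descends into the scheduler log directories and hits the dated directory plus its latest symlink.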
Operating System
Linux
Versions of Apache Airflow Providers
No response
Deployment
Docker-Compose
Deployment details
No response
Anything else
As a workaround, adding logs/splunk/scheduler/latest to the .airflowignore resolved the issue for us.
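Concretely, the workaround is a one-line .airflowignore at the root of the DAGs folder (/usr/local/airflow in this setup). Note that in Airflow 2.3 the file is interpreted as regexp patterns by default, so the line below is matched as a regular expression against file paths under the DAGs folder:

```
# .airflowignore in the DAGs folder root (/usr/local/airflow here)
logs/splunk/scheduler/latest
```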
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project’s Code of Conduct
Issue Analytics
- State: closed
- Created: a year ago
- Comments: 7 (4 by maintainers)

No worries - this is not an urgent one - assigned you 😃
This issue has been closed because it has not received a response from the issue author.