Airflow .airflowignore not handling soft link properly.
Apache Airflow version
2.3.0 (latest released)
What happened
A soft link and the folder it points to, both under the same root folder, are handled as the same relative path. Say I have a dags folder which looks like this:
-dags:
-- .airflowignore
-- folder
-- soft-links-to-folder -> folder
and .airflowignore contains:
folder/
Both folder and soft-links-to-folder are ignored.
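For anyone who wants to recreate this layout locally, here is a minimal sketch that builds it in a temporary directory. The helper name build_layout is made up for illustration and is not part of Airflow; it assumes a POSIX system where creating symlinks is allowed.

```python
import os
import tempfile
from pathlib import Path


def build_layout(root: Path) -> Path:
    """Create the dags layout from the report under `root` (hypothetical helper)."""
    dags = root / "dags"
    (dags / "folder").mkdir(parents=True)
    # Same single-line ignore file as in the report.
    (dags / ".airflowignore").write_text("folder/\n")
    # Relative link target, i.e. "soft-links-to-folder -> folder".
    os.symlink("folder", dags / "soft-links-to-folder", target_is_directory=True)
    return dags


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        dags = build_layout(Path(tmp))
        for entry in sorted(dags.iterdir()):
            if entry.is_symlink():
                kind = "symlink"
            elif entry.is_dir():
                kind = "dir"
            else:
                kind = "file"
            print(f"{entry.name}: {kind}")
```

Pointing dags_folder at the generated directory should be enough to observe which of the two entries the scheduler skips.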
What you think should happen instead
Only folder should be ignored. This was the behavior in Airflow 2.2.4, before I upgraded. The root cause is that both _RegexpIgnoreRule and _GlobIgnoreRule call the relative_to method to get the search path.
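To illustrate how a symlink and its target can end up with the same relative path, here is a small standalone sketch using only pathlib. It is not Airflow's code: relative_candidates is a hypothetical helper, and it assumes the path is resolved before relative_to is applied, which is what the first comment below suggests.

```python
import os
import tempfile
from pathlib import Path
from typing import Tuple


def relative_candidates(base_dir: Path, path: Path) -> Tuple[str, str]:
    """Return (resolved, unresolved) relative paths for `path` (hypothetical helper)."""
    # resolve() follows symlinks, so a link to "folder" becomes "folder" itself.
    resolved = path.resolve().relative_to(base_dir.resolve())
    # Without resolve(), the link keeps its own name.
    unresolved = path.relative_to(base_dir)
    return str(resolved), str(unresolved)


def demo() -> Tuple[str, str]:
    with tempfile.TemporaryDirectory() as tmp:
        dags = Path(tmp).resolve() / "dags"
        (dags / "folder").mkdir(parents=True)
        os.symlink("folder", dags / "soft-links-to-folder", target_is_directory=True)
        return relative_candidates(dags, dags / "soft-links-to-folder")


if __name__ == "__main__":
    print(demo())
```

On a POSIX system the resolved candidate is "folder" while the unresolved one is "soft-links-to-folder", so any rule that ignores folder also hits the link once the path has been resolved.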
How to reproduce
See @tirkarthi's comment for the test case.
Operating System
ubuntu
Versions of Apache Airflow Providers
No response
Deployment
Official Apache Airflow Helm Chart
Deployment details
No response
Anything else
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project’s Code of Conduct
Issue Analytics
- State:
- Created a year ago
- Comments:10 (9 by maintainers)
Top GitHub Comments
Here is a sample test case for the report. I guess this has more to do with the fact that resolve is called before match, which resolves the symlink to the original folder here. As per the report, “soft-links-to-folder” will resolve to “folder” and get ignored: https://github.com/apache/airflow/blob/6cc41abf6912fd2705b9ef7cf368c888c43c8af8/airflow/utils/file.py#L68
The test case passes before the changes in https://github.com/apache/airflow/pull/22051. cc: @ianbuss
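As a rough standalone illustration of the difference (this is not the test case referred to above, and not Airflow's implementation), the sketch below walks a dags directory and applies one ignore pattern either to resolved or to unresolved relative paths. The function find_dag_files and its resolve_first flag are hypothetical. The pattern is anchored as ^folder/ on purpose, since an unanchored regular expression folder/ would also match the substring in soft-links-to-folder/.

```python
import os
import re
import tempfile
from pathlib import Path
from typing import List


def find_dag_files(dags_dir: Path, ignore_pattern: str, resolve_first: bool) -> List[str]:
    """List .py files under dags_dir, skipping directories matching ignore_pattern."""
    pattern = re.compile(ignore_pattern)
    base = dags_dir.resolve() if resolve_first else dags_dir
    found: List[str] = []
    for root, dirs, files in os.walk(dags_dir, followlinks=True):
        kept = []
        for name in dirs:
            path = Path(root) / name
            # Resolving first turns a symlink into its target before matching.
            candidate = path.resolve() if resolve_first else path
            rel = str(candidate.relative_to(base)) + "/"
            if pattern.search(rel) is None:
                kept.append(name)
        dirs[:] = kept  # prune ignored directories in place
        for name in files:
            if name.endswith(".py"):
                found.append(str((Path(root) / name).relative_to(dags_dir)))
    return sorted(found)


def demo() -> List[List[str]]:
    with tempfile.TemporaryDirectory() as tmp:
        dags = Path(tmp).resolve() / "dags"
        (dags / "folder").mkdir(parents=True)
        (dags / "folder" / "dag.py").write_text("# placeholder DAG file\n")
        os.symlink("folder", dags / "soft-links-to-folder", target_is_directory=True)
        return [
            find_dag_files(dags, r"^folder/", resolve_first=True),
            find_dag_files(dags, r"^folder/", resolve_first=False),
        ]


if __name__ == "__main__":
    print(demo())
```

With resolve_first=True nothing is found, because the link is matched under its target's name. With resolve_first=False the file is found through soft-links-to-folder only, which is the behavior the report expects.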
I have prepped a simple initial PR which should hopefully restore the original behaviour (it includes the test case provided by @tirkarthi, thanks!), but it would be good to get some additional eyes on it. If we want to make larger changes to the symlink handling, that should perhaps be a future PR with further thought? It depends on the timeline of 2.3.1, I think.