Error reading command stream when executing git
Apache Airflow version: 2.1.0
Environment:
- OS: CentOS 7
- Kernel: Linux app-tddo-pc1.fhm.de 3.10.0-1127.10.1.el7.x86_64 #1 SMP Wed Jun 3 14:28:03 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
- Install tools: pip
- Others:
What happened:
I have several Pods running Alpine, Python 3.8 and Airflow 2.1.0 that work perfectly.
I have some DAGs that perform a git clone <repo_url> in a BashOperator, and they work well.
Now I have Airflow 2.1.0 on a CentOS server (not a VM) with Python 3.8 (I also tried Python 3.6.x and 3.9.x).
The git clone <repo_url> doesn't work in this new installation. I always get the following error:
Error reading command stream
When using the BashOperator, this is the error message:
{subprocess.py:79} INFO - Error reading command stream
I tried several ways to perform the git clone:
- BashOperator
- PythonOperator using either the os or the subprocess library
- PythonOperator calling GitPython: git.Git("dir").clone("repo_url") (it executes a git clone command)
However, I can successfully perform the git clone in all of the mentioned ways (os, subprocess and GitPython) from the Python console on the same server, as the same user that runs Airflow.
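One way to narrow down a difference like this is to run the same command from both the Python console and a PythonOperator, capturing stdout and stderr together along with the settings git depends on. The following is a minimal sketch using only the standard library; the names `run_and_capture` and `environment_summary`, and the choice of variables to report, are illustrative assumptions and not part of the original report.

```python
import os
import subprocess
import sys


def run_and_capture(cmd, env=None):
    """Run cmd (a list of arguments) and return (exit_code, combined_output).

    stderr is merged into stdout so that messages printed on stderr
    end up in the same stream as regular output.
    """
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        stdin=subprocess.DEVNULL,  # a scheduled task has no terminal to read from
        env=env,
        universal_newlines=True,
        check=False,
    )
    return proc.returncode, proc.stdout


def environment_summary():
    """Collect settings that may differ between an interactive shell and a task."""
    keys = ("PATH", "HOME", "USER", "GIT_SSH", "http_proxy", "https_proxy")
    summary = {key: os.environ.get(key) for key in keys}
    summary["python"] = sys.executable  # which interpreter is actually running
    return summary
```

Calling `run_and_capture(["git", "--version"])` and `environment_summary()` in both contexts and comparing the results can show whether the task sees a different git binary, PATH or home directory than the console does.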
What you expected to happen:
git clone is successful.
How to reproduce it:
Install Airflow on CentOS 7 and create a DAG that executes a git clone in a BashOperator.
Issue Analytics
- State:
- Created 2 years ago
- Comments:7 (4 by maintainers)
Top GitHub Comments
I tried with cron and it works. I also tried a git pull from the same DAG and it works. It looks like only git clone fails. But I managed to solve the issue: the git 1.8.3.1 shipped with CentOS 7 appears to be too old; after upgrading to 2.30.1 everything works. Interactive versus non-interactive execution, passwords, etc. play no role, because git executed through the BashOperator can use the user/password defined in ~/.git-credentials.
"There are plenty of reasons why such connectivity might fail, but they are environmental problem, not airflow." In the end, it was the git version in CentOS 7.
Regards
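Since the fix turned out to be the git version, a task could check the installed version before cloning. This is a minimal sketch, not a definitive check: the thread only establishes that 1.8.3.1 failed and 2.30.1 worked, so the threshold `ASSUMED_MIN_GIT` below is a hypothetical placeholder, and the function names are illustrative.

```python
import re
import subprocess

# Hypothetical threshold: the thread only shows that 1.8.3.1 failed and
# 2.30.1 worked, not the exact version where the behaviour changes.
ASSUMED_MIN_GIT = (2, 0, 0)


def parse_git_version(text):
    """Extract the numeric version tuple from `git --version` output."""
    match = re.search(r"git version (\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError("unrecognised git version string: %r" % text)
    return tuple(int(part) for part in match.group(1).split("."))


def is_new_enough(version, minimum=ASSUMED_MIN_GIT):
    """Compare version tuples element by element, e.g. (1, 8, 3, 1) < (2, 0, 0)."""
    return version >= minimum


def installed_git_version():
    """Ask the git found on PATH for its version."""
    output = subprocess.check_output(["git", "--version"], universal_newlines=True)
    return parse_git_version(output)
```

For example, `is_new_enough(parse_git_version("git version 1.8.3.1"))` is false under the assumed threshold, while the same check on "git version 2.30.1" passes.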
Another reason might be that your host is not in the known_hosts file, which happens the first time you connect to it. SSH then asks whether it is OK to add the host key, and under Airflow the command exits because there is no one to confirm. This can be solved by adding the host key manually or by using -o StrictHostKeyChecking=no.
Yet another reason might be that the server closes the connection for some reason (wrong IP, wrong proxy, etc.). In that case you need to look for the cause in the server logs.
There are plenty of reasons why such connectivity might fail, but they are environmental problems, not Airflow problems.
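For SSH remotes, the -o StrictHostKeyChecking=no suggestion above can be passed to git through the GIT_SSH_COMMAND environment variable. This is a sketch under stated assumptions: GIT_SSH_COMMAND is only honoured by git 2.3 and later (so not by the 1.8.3.1 discussed in this thread), the helper name `ssh_clone_env` is hypothetical, and BatchMode=yes is added here so that ssh fails instead of waiting for input. Disabling host key checking weakens security; adding the host key to known_hosts manually avoids that trade-off.

```python
import os
import shlex


def ssh_clone_env(base_env=None, strict_host_key_checking=False):
    """Return a copy of the environment with GIT_SSH_COMMAND set for unattended use."""
    env = dict(os.environ if base_env is None else base_env)
    options = [
        "ssh",
        "-o", "BatchMode=yes",  # fail instead of prompting for input
        "-o", "StrictHostKeyChecking=" + ("yes" if strict_host_key_checking else "no"),
    ]
    # Quote each part so the value stays a valid shell command line.
    env["GIT_SSH_COMMAND"] = " ".join(shlex.quote(part) for part in options)
    return env
```

The returned dictionary can be passed as `env=` to `subprocess.run` when invoking git clone from a task.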