SimpleHttpOperator aborts connection after 5 minutes
Apache Airflow version: v1.10.4
Kubernetes version (if you are using kubernetes) (use kubectl version):
Environment: puckel/docker-airflow
What happened:
The HTTP request to the API has its connection aborted after 5 minutes. I’m trying to run a long request, but every time this error occurs:
[2020-04-06 11:44:11,217] {{logging_mixin.py:95}} INFO - [2020-04-06 11:44:11,217] {{http_hook.py:131}} INFO - Sending 'POST' to url: {api_url_here}
[2020-04-06 11:49:19,249] {{logging_mixin.py:95}} WARNING - /usr/local/lib/python3.7/site-packages/airflow/hooks/http_hook.py:181: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
self.log.warn(str(ex) + ' Tenacity will retry to execute the operation')
[2020-04-06 11:49:19,250] {{logging_mixin.py:95}} INFO - [2020-04-06 11:49:19,250] {{http_hook.py:181}} WARNING - ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')) Tenacity will retry to execute the operation
[2020-04-06 11:49:19,250] {{taskinstance.py:1047}} ERROR - ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 603, in urlopen
chunked=chunked)
File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 387, in _make_request
six.raise_from(e, None)
File "<string>", line 2, in raise_from
File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 383, in _make_request
httplib_response = conn.getresponse()
File "/usr/local/lib/python3.7/http/client.py", line 1336, in getresponse
response.begin()
File "/usr/local/lib/python3.7/http/client.py", line 306, in begin
version, status, reason = self._read_status()
File "/usr/local/lib/python3.7/http/client.py", line 275, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
timeout=timeout
File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 641, in urlopen
_stacktrace=sys.exc_info()[2])
File "/usr/local/lib/python3.7/site-packages/urllib3/util/retry.py", line 368, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/usr/local/lib/python3.7/site-packages/urllib3/packages/six.py", line 685, in reraise
raise value.with_traceback(tb)
File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 603, in urlopen
chunked=chunked)
File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 387, in _make_request
six.raise_from(e, None)
File "<string>", line 2, in raise_from
File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 383, in _make_request
httplib_response = conn.getresponse()
File "/usr/local/lib/python3.7/http/client.py", line 1336, in getresponse
response.begin()
File "/usr/local/lib/python3.7/http/client.py", line 306, in begin
version, status, reason = self._read_status()
File "/usr/local/lib/python3.7/http/client.py", line 275, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 922, in _run_raw_task
result = task_copy.execute(context=context)
File "/usr/local/lib/python3.7/site-packages/airflow/operators/http_operator.py", line 92, in execute
self.extra_options)
File "/usr/local/lib/python3.7/site-packages/airflow/hooks/http_hook.py", line 132, in run
return self.run_and_check(session, prepped_request, extra_options)
File "/usr/local/lib/python3.7/site-packages/airflow/hooks/http_hook.py", line 182, in run_and_check
raise ex
File "/usr/local/lib/python3.7/site-packages/airflow/hooks/http_hook.py", line 174, in run_and_check
allow_redirects=extra_options.get("allow_redirects", True))
File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python3.7/site-packages/requests/adapters.py", line 498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
What you expected to happen: The operator should wait for the response without timing out
How to reproduce it: Run a long request using SimpleHttpOperator
Anything else we need to know:
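The innermost exception in the traceback above, http.client.RemoteDisconnected, can be reproduced outside Airflow with the standard library alone. The following is a minimal sketch, not the reporter's setup: the local server is a hypothetical stand-in for whatever closes the connection, it hangs up immediately instead of after 5 minutes, and the function names and the /long-running path are made up for illustration.

```python
import http.client
import socket
import threading


def serve_once_and_hang_up(listener: socket.socket) -> None:
    """Accept one connection, read the request headers, then close without replying."""
    conn, _ = listener.accept()
    with conn:
        data = b""
        # Read until the end of the HTTP headers so no unread bytes remain.
        while b"\r\n\r\n" not in data:
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
        # Leaving the "with" block closes the socket; no response is ever sent.


def reproduce_remote_disconnected() -> str:
    """Send a POST to the hang-up server and report what the client sees."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    worker = threading.Thread(
        target=serve_once_and_hang_up, args=(listener,), daemon=True
    )
    worker.start()

    client = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        client.request("POST", "/long-running")  # hypothetical path
        client.getresponse()
        return "response received"
    except http.client.RemoteDisconnected as exc:
        return f"RemoteDisconnected: {exc}"
    finally:
        client.close()
        worker.join(timeout=5)
        listener.close()


if __name__ == "__main__":
    print(reproduce_remote_disconnected())
```

The client does nothing wrong here: the peer simply closes the socket before any status line arrives, which is the same condition urllib3 and requests wrap into ProtocolError and ConnectionError in the traceback.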
Issue Analytics
- State:
- Created 3 years ago
- Comments:14 (8 by maintainers)
Top GitHub Comments
Also, to anyone in this thread: if you face a similar issue, please try the 4.0.0rc1 HTTP provider!
You might want to increase the keep-alive frequency beyond the default settings. If it is your firewall (or whatever sits between you and the host) that closes the connection, it seems overly aggressive. I would expect an idle connection to be closed after an hour, but 5 minutes is unheard of, so maybe someone at your company has a very aggressive policy. The description of the KeepAlive feature mentions that firewalls might also aggressively close connections when keep-alive packets do not arrive frequently enough.
Ideally, find out who is doing it and what the rules are, then adjust the settings. Or experiment.
Or maybe this is a completely different problem. But finding out who is closing the connection is the only way forward (it’s not Airflow, for sure).
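For readers unfamiliar with what the keep-alive settings mentioned above control, here is a rough sketch of TCP keep-alive at the socket level using only the Python standard library. It does not show the Airflow HTTP provider's own API; the function name and the idle, interval, and count values are illustrative assumptions, and the TCP_KEEP* options are platform dependent, so each one is applied only if the running platform exposes it.

```python
import socket


def enable_tcp_keepalive(sock, idle=120, interval=30, count=20):
    """Turn on TCP keep-alive probes for an existing socket.

    idle:     seconds of silence before the first probe (illustrative value)
    interval: seconds between probes (illustrative value)
    count:    unanswered probes before the connection is dropped (illustrative value)
    Returns a dict of the options that were actually applied.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied = {"SO_KEEPALIVE": 1}

    # These constants are missing on some platforms, so look them up defensively.
    tuning = (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", count),
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
            applied[name] = value
    return applied


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print(enable_tcp_keepalive(s))
```

If a firewall drops connections that stay silent for 5 minutes, an idle value well below 300 seconds means probes flow before that cutoff. Whether that actually keeps the connection open depends on the device in between, which is why finding out its rules comes first.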