SimpleHttpOperator aborts connection after 5 minutes


Apache Airflow version: v1.10.4

Kubernetes version (if you are using kubernetes) (use kubectl version):

Environment: puckel/docker-airflow

What happened:

The connection for the HTTP request is aborted after 5 minutes. I’m trying to run a long-running request, but this error occurs every time:

[2020-04-06 11:44:11,217] {{logging_mixin.py:95}} INFO - [2020-04-06 11:44:11,217] {{http_hook.py:131}} INFO - Sending 'POST' to url: {api_url_here}
[2020-04-06 11:49:19,249] {{logging_mixin.py:95}} WARNING - /usr/local/lib/python3.7/site-packages/airflow/hooks/http_hook.py:181: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
  self.log.warn(str(ex) + ' Tenacity will retry to execute the operation')
[2020-04-06 11:49:19,250] {{logging_mixin.py:95}} INFO - [2020-04-06 11:49:19,250] {{http_hook.py:181}} WARNING - ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')) Tenacity will retry to execute the operation
[2020-04-06 11:49:19,250] {{taskinstance.py:1047}} ERROR - ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 603, in urlopen
    chunked=chunked)
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 387, in _make_request
    six.raise_from(e, None)
  File "<string>", line 2, in raise_from
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 383, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/local/lib/python3.7/http/client.py", line 1336, in getresponse
    response.begin()
  File "/usr/local/lib/python3.7/http/client.py", line 306, in begin
    version, status, reason = self._read_status()
  File "/usr/local/lib/python3.7/http/client.py", line 275, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 641, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/local/lib/python3.7/site-packages/urllib3/util/retry.py", line 368, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/local/lib/python3.7/site-packages/urllib3/packages/six.py", line 685, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 603, in urlopen
    chunked=chunked)
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 387, in _make_request
    six.raise_from(e, None)
  File "<string>", line 2, in raise_from
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 383, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/local/lib/python3.7/http/client.py", line 1336, in getresponse
    response.begin()
  File "/usr/local/lib/python3.7/http/client.py", line 306, in begin
    version, status, reason = self._read_status()
  File "/usr/local/lib/python3.7/http/client.py", line 275, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 922, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/usr/local/lib/python3.7/site-packages/airflow/operators/http_operator.py", line 92, in execute
    self.extra_options)
  File "/usr/local/lib/python3.7/site-packages/airflow/hooks/http_hook.py", line 132, in run
    return self.run_and_check(session, prepped_request, extra_options)
  File "/usr/local/lib/python3.7/site-packages/airflow/hooks/http_hook.py", line 182, in run_and_check
    raise ex
  File "/usr/local/lib/python3.7/site-packages/airflow/hooks/http_hook.py", line 174, in run_and_check
    allow_redirects=extra_options.get("allow_redirects", True))
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/adapters.py", line 498, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

What you expected to happen: The operator should wait for the response to the request without timing out

How to reproduce it: Run a long request using SimpleHttpOperator
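
The exception at the bottom of the traceback can be reproduced without Airflow, which helps separate the symptom from the operator. The following is a minimal sketch using only the Python standard library: a throwaway local server (a hypothetical stand-in for the real API) accepts one request and closes the socket without answering, and the client then sees `RemoteDisconnected`. The function names and the `/long-running` path are illustrative assumptions, not part of the original report.

```python
import http.client
import socket
import threading


def serve_once_and_hang_up(listener: socket.socket) -> None:
    """Accept one connection, read the request, then close without replying."""
    conn, _ = listener.accept()
    with conn:
        conn.recv(65536)  # consume the request so the close is a clean shutdown


def reproduce() -> str:
    """Return the name of the exception the client observes."""
    # Bind to an ephemeral localhost port (stand-in for the remote API).
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]

    thread = threading.Thread(target=serve_once_and_hang_up, args=(listener,))
    thread.start()

    client = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        client.request("POST", "/long-running")
        client.getresponse()  # the peer closes before sending a status line
    except http.client.RemoteDisconnected as exc:
        return type(exc).__name__
    finally:
        client.close()
        thread.join()
        listener.close()
    return "no error"


if __name__ == "__main__":
    print(reproduce())
```

In the real case the close comes from whatever sits between the worker and the API (or from the API itself) after the connection has been idle for 5 minutes, rather than immediately as in this sketch.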

Anything else we need to know:

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 14 (8 by maintainers)

Top GitHub Comments

1 reaction
potiuk commented, Jul 15, 2022

Also, anyone in this thread: if you face a similar issue, please try the 4.0.0rc1 HTTP provider!

0 reactions
potiuk commented, Jul 15, 2022

You might want to make the keep-alive more frequent than the default settings. If it’s your firewall (or whatever is between you and the host) that closes the connection, it seems overly aggressive. I would expect an idle connection to be closed after an hour, but 5 minutes is unheard of. So maybe someone at your company has a very aggressive policy. If you look at the description of the KeepAlive feature, they mention that firewalls might also aggressively close connections where keep-alive probes do not come frequently enough.

Ideally - find out who is doing it and what are the rules, then adjust the settings. Or experiment.

Or maybe this is completely different problem. But finding out who is doing it is the only way forward (it’s not Airflow for sure).
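
To make the keep-alive idea concrete, here is a minimal sketch of enabling TCP keep-alive on a plain socket with the Python standard library. It is not the HTTP provider's implementation, and the helper name `enable_keepalive` and the timing values are illustrative assumptions. The idle time only needs to be comfortably below the 5-minute cut-off so that a probe is sent before the connection counts as idle.

```python
import socket


def enable_keepalive(sock: socket.socket, idle: int = 60,
                     interval: int = 30, count: int = 5) -> None:
    """Turn on TCP keep-alive probes for an otherwise idle connection.

    idle:     seconds of silence before the first probe (assumed value)
    interval: seconds between probes (assumed value)
    count:    unanswered probes before the connection is dropped (assumed value)
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The fine-grained options are platform-specific, so set only those
    # that this platform's socket module actually exposes.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print(bool(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)))
    s.close()
```

Whether this helps depends on what is closing the connection: keep-alive probes only prevent the drop if the device in between treats them as activity, which is why finding out who closes the connection comes first.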
