TLS connection fails through HTTPS proxy after CONNECT tunnel is established

I set the proxy with this middleware:

class HttpProxyMiddleware(object):
    def process_request(self, request, spider):
        request.meta['proxy'] = 'https://127.0.0.1:8787'

The error is:

scrapy.core.downloader.handlers.http11.TunnelError: Could not open CONNECT tunnel with proxy 127.0.0.1:8787

Then I tested the proxy with requests: resp = requests.get('https://......', proxies={'https': 'https://127.0.0.1:8787'}). It works!

So, what is happening?

Issue Analytics

  • State: closed
  • Created 7 years ago
  • Comments: 10 (7 by maintainers)

Top GitHub Comments

redapple commented, Jan 12, 2017

Alright, I think I figured this one out. What happens is that Scrapy’s TunnelingTCP4ClientEndpoint does not consume all of the proxy’s response to the CONNECT request (it only checks that the status code is 200). Because TunnelingTCP4ClientEndpoint.processProxyResponse() is called with chunks of the response, the remaining bytes on the transport are fed into the TLS layer as if the server had sent them in response to the ClientHello. These are plain ASCII bytes, so OpenSSL says “No way!”

Step 0: send CONNECT

CONNECT www.example.com:443 HTTP/1.1
Host: www.example.com:443

Step 1: receive first chunk:

'HTTP/1.1 200 OK\r\nKeep-Alive'

Step 2: Scrapy says: “Cool! The proxy is ready, let’s initiate the TLS connection.” A ClientHello is sent over the TCP connection…

Step 3: there are more bytes from the proxy where the HTTP 200 came from…

': timeout=38\r\nContent-Length: 0\r\n\r\n'

Step 4: OpenSSL is not happy with these bytes (they are not a ServerHello) and aborts the connection
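
The steps above can be reproduced without any network traffic. The following is only a sketch: `naive_tunnel_ready` is a hypothetical stand-in for a check that looks at the status code alone, not Scrapy's actual code, and the two chunks are the ones quoted in steps 1 and 3.

```python
def naive_tunnel_ready(first_chunk: bytes) -> bool:
    """Hypothetical check: inspect only the status code of the first chunk."""
    status_line = first_chunk.split(b"\r\n", 1)[0]
    parts = status_line.split(b" ", 2)
    return len(parts) >= 2 and parts[1] == b"200"


# The proxy's reply to CONNECT, arriving in two chunks as in steps 1 and 3.
chunks = [
    b"HTTP/1.1 200 OK\r\nKeep-Alive",
    b": timeout=38\r\nContent-Length: 0\r\n\r\n",
]

# Step 2: the first chunk alone is enough for the naive check to pass.
ready = naive_tunnel_ready(chunks[0])

# Steps 3 and 4: whatever arrives afterwards is handed to the TLS layer.
leaked_to_tls = b"".join(chunks[1:])

# A TLS handshake record starts with the content-type byte 0x16;
# leftover HTTP header text starts with an ASCII character instead.
looks_like_tls = leaked_to_tls[:1] == b"\x16"
```

Here `ready` is true after the first chunk, while `leaked_to_tls` still holds the tail of the HTTP headers, which is what the TLS layer ends up rejecting.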

Simple fix: add a small buffer when reading the initial response from the proxy and detect \r\n\r\n before starting the TLS negotiation.
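
A minimal sketch of that simple fix, assuming the proxy's reply can arrive in arbitrary chunks. `ConnectResponseBuffer` and its methods are hypothetical names for illustration and are not part of Scrapy or Twisted.

```python
class ConnectResponseBuffer:
    """Hypothetical helper: buffer the proxy's reply until the header block ends."""

    TERMINATOR = b"\r\n\r\n"

    def __init__(self):
        self._buf = b""
        self.headers = None   # raw status line + headers once complete
        self.leftover = b""   # bytes received after the blank line

    def feed(self, chunk: bytes) -> bool:
        """Add a chunk; return True once the full header block has been seen."""
        if self.headers is not None:
            self.leftover += chunk
            return True
        self._buf += chunk
        head, sep, rest = self._buf.partition(self.TERMINATOR)
        if not sep:
            return False      # terminator not seen yet, keep buffering
        self.headers = head
        self.leftover = rest
        self._buf = b""
        return True

    def status_ok(self) -> bool:
        """True if the buffered status line carries a 200 status code."""
        if self.headers is None:
            return False
        parts = self.headers.split(b"\r\n", 1)[0].split(b" ", 2)
        return len(parts) >= 2 and parts[1] == b"200"


buf = ConnectResponseBuffer()
results = [
    buf.feed(b"HTTP/1.1 200 OK\r\nKeep-Alive"),
    buf.feed(b": timeout=38\r\nContent-Length: 0\r\n\r\n"),
]
# Only start the TLS negotiation once the headers are complete and the status is 200.
start_tls = results[-1] and buf.status_ok()
```

With this approach the first chunk does not trigger the handshake; TLS would start only after the second chunk completes the header block. Because the buffer is searched as a whole, a terminator split across two chunks is still detected.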

Advanced fix: use some HTTP parsing state machine (Twisted’s?) to do this properly.

aiportal commented, Jan 14, 2017

Thanks very much. 👍
