
Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.


SSL website. `twisted.internet.error.ConnectionLost`

See original GitHub issue

Hi everybody! I hit this error on both operating systems. This HTTPS site can’t be downloaded via Scrapy (Twisted). I looked through this issue board and didn’t find a solution.

Both: Debian 9 / macOS

$ scrapy shell "https://wwwnet1.state.nj.us/"
2017-09-07 16:23:02 [scrapy.utils.log] INFO: Scrapy 1.4.0 started (bot: scrapybot)
2017-09-07 16:23:02 [scrapy.utils.log] INFO: Overridden settings: {'LOGSTATS_INTERVAL': 0, 'DUPEFILTER_CLASS': 'scrapy.dupefilters.BaseDupeFilter'}
2017-09-07 16:23:02 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.memusage.MemoryUsage',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.corestats.CoreStats']
2017-09-07 16:23:02 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2017-09-07 16:23:02 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2017-09-07 16:23:03 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2017-09-07 16:23:03 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2017-09-07 16:23:03 [scrapy.core.engine] INFO: Spider opened
2017-09-07 16:23:03 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://wwwnet1.state.nj.us/> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2017-09-07 16:23:03 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://wwwnet1.state.nj.us/> (failed 2 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2017-09-07 16:23:04 [scrapy.downloadermiddlewares.retry] DEBUG: Gave up retrying <GET https://wwwnet1.state.nj.us/> (failed 3 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
Traceback (most recent call last):
  File "scrapy", line 11, in <module>
    sys.exit(execute())
  File "/lib/python3.5/site-packages/scrapy/cmdline.py", line 149, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "/lib/python3.5/site-packages/scrapy/cmdline.py", line 89, in _run_print_help
    func(*a, **kw)
  File "/lib/python3.5/site-packages/scrapy/cmdline.py", line 156, in _run_command
    cmd.run(args, opts)
  File "/lib/python3.5/site-packages/scrapy/commands/shell.py", line 73, in run
    shell.start(url=url, redirect=not opts.no_redirect)
  File "/lib/python3.5/site-packages/scrapy/shell.py", line 48, in start
    self.fetch(url, spider, redirect=redirect)
  File "/lib/python3.5/site-packages/scrapy/shell.py", line 115, in fetch
    reactor, self._schedule, request, spider)
  File "/lib/python3.5/site-packages/twisted/internet/threads.py", line 122, in blockingCallFromThread
    result.raiseException()
  File "/lib/python3.5/site-packages/twisted/python/failure.py", line 385, in raiseException
    raise self.value.with_traceback(self.tb)
twisted.web._newclient.ResponseNeverReceived: [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]

macOS:

$ scrapy version -v
Scrapy    : 1.4.0
lxml      : 3.8.0.0
libxml2   : 2.9.4
cssselect : 1.0.1
parsel    : 1.2.0
w3lib     : 1.18.0
Twisted   : 17.9.0rc1
Python    : 3.5.1 (default, Jan 22 2016, 08:54:32) - [GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)]
pyOpenSSL : 17.2.0 (OpenSSL 1.1.0f  25 May 2017)
Platform  : Darwin-16.7.0-x86_64-i386-64bit

Debian 9:

$ scrapy version -v
Scrapy    : 1.4.0
lxml      : 3.8.0.0
libxml2   : 2.9.3
cssselect : 1.0.1
parsel    : 1.2.0
w3lib     : 1.18.0
Twisted   : 17.9.0rc1
Python    : 3.4.2 (default, Oct  8 2014, 10:45:20) - [GCC 4.9.1]
pyOpenSSL : 17.2.0 (OpenSSL 1.1.0f  25 May 2017)
Platform  : Linux-3.16.0-4-amd64-x86_64-with-debian-8.7

macOS:

$ openssl s_client -connect wwwnet1.state.nj.us:443 -servername wwwnet1.state.nj.us
CONNECTED(00000003)
140736760988680:error:140790E5:SSL routines:ssl23_write:ssl handshake failure:s23_lib.c:177:
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 0 bytes and written 336 bytes
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
SSL-Session:
    Protocol  : TLSv1.2
    Cipher    : 0000
    Session-ID: 
    Session-ID-ctx: 
    Master-Key: 
    Key-Arg   : None
    PSK identity: None
    PSK identity hint: None
    SRP username: None
    Start Time: 1504790705
    Timeout   : 300 (sec)
    Verify return code: 0 (ok)
---

Debian 9:

CONNECTED(00000003)
---
Certificate chain
 0 s:/C=US/ST=New Jersey/L=Trenton/O=New Jersey State Government/OU=E-Gov Services - wwwnet1.state.nj.us/CN=wwwnet1.state.nj.us
   i:/C=US/O=Symantec Corporation/OU=Symantec Trust Network/CN=Symantec Class 3 Secure Server SHA256 SSL CA
---
Server certificate
-----BEGIN CERTIFICATE-----
<cut out>
-----END CERTIFICATE-----
<cut out>
---
No client certificate CA names sent
---
SSL handshake has read 1724 bytes and written 635 bytes
---
New, TLSv1/SSLv3, Cipher is DES-CBC3-SHA
Server public key is 2048 bit
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
SSL-Session:
    Protocol  : TLSv1
    Cipher    : DES-CBC3-SHA
    Session-ID: 930F00007F5944DC3C6010F96E95E7FA63656EF5EA35508B055078CEC249DC38
    Session-ID-ctx:
    Master-Key: 27B02D427F006A57B121CCEFEAA7F33B870DE262848BB6F851242F48F051ABB77BA4ED06706766EE8EE55F6643C9FF55
    Key-Arg   : None
    PSK identity: None
    PSK identity hint: None
    SRP username: None
    Start Time: 1504790821
    Timeout   : 300 (sec)
    Verify return code: 21 (unable to verify the first certificate)
---
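The two `s_client` probes above can also be reproduced from plain Python, which is handy for checking what the OpenSSL build backing your interpreter will negotiate. This is a hedged sketch; `probe_tls` is a hypothetical helper, and the result depends entirely on the local OpenSSL, just as it did between the macOS and Debian runs above.

```python
import socket
import ssl

def probe_tls(host, port=443, timeout=5):
    """Attempt a TLS handshake and report the negotiated protocol and
    cipher, or the failure -- roughly what `openssl s_client` shows.
    Results depend on the OpenSSL build backing the local Python."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE  # diagnosing the handshake only
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                return "ok: %s, %s" % (tls.version(), tls.cipher()[0])
    except (OSError, ssl.SSLError) as exc:
        return "failed: %s" % exc

# Network-dependent example:
# print(probe_tls("wwwnet1.state.nj.us"))
```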

Thank you for your time.

Issue Analytics

  • State: open
  • Created 6 years ago
  • Comments: 17 (5 by maintainers)

Top GitHub Comments

4 reactions
redapple commented, Sep 7, 2017

This worked for me:

  • force TLS 1.0
  • use cryptography<2 (e.g. 1.9 in my case, before OpenSSL 1.1)
$ scrapy version -v
Scrapy    : 1.4.0
lxml      : 3.8.0.0
libxml2   : 2.9.3
cssselect : 1.0.1
parsel    : 1.2.0
w3lib     : 1.18.0
Twisted   : 17.5.0
Python    : 3.6.2 (default, Aug 24 2017, 10:48:24) - [GCC 6.3.0 20170406]
pyOpenSSL : 17.2.0 (OpenSSL 1.0.2g  1 Mar 2016)


$ pip freeze
asn1crypto==0.22.0
attrs==17.2.0
Automat==0.6.0
cffi==1.10.0
constantly==15.1.0
cryptography==1.9
cssselect==1.0.1
hyperlink==17.3.1
idna==2.6
incremental==17.5.0
lxml==3.8.0
parsel==1.2.0
pyasn1==0.3.3
pyasn1-modules==0.1.1
pycparser==2.18
PyDispatcher==2.0.5
pyOpenSSL==17.2.0
queuelib==1.4.2
Scrapy==1.4.0
service-identity==17.0.0
six==1.10.0
Twisted==17.5.0
w3lib==1.18.0
zope.interface==4.4.2

$ scrapy shell "https://wwwnet1.state.nj.us/" -s DOWNLOADER_CLIENT_TLS_METHOD=TLSv1.0
2017-09-07 17:45:49 [scrapy.utils.log] INFO: Scrapy 1.4.0 started (bot: scrapybot)
2017-09-07 17:45:49 [scrapy.utils.log] INFO: Overridden settings: {'DOWNLOADER_CLIENT_TLS_METHOD': 'TLSv1.0', 'DUPEFILTER_CLASS': 'scrapy.dupefilters.BaseDupeFilter', 'LOGSTATS_INTERVAL': 0}
2017-09-07 17:45:49 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.memusage.MemoryUsage']
2017-09-07 17:45:49 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2017-09-07 17:45:49 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2017-09-07 17:45:49 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2017-09-07 17:45:49 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2017-09-07 17:45:49 [scrapy.core.engine] INFO: Spider opened
2017-09-07 17:45:49 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://wwwnet1.state.nj.us/> (referer: None)
[s] Available Scrapy objects:
[s]   scrapy     scrapy module (contains scrapy.Request, scrapy.Selector, etc)
[s]   crawler    <scrapy.crawler.Crawler object at 0x7f24fb802ac8>
[s]   item       {}
[s]   request    <GET https://wwwnet1.state.nj.us/>
[s]   response   <200 https://wwwnet1.state.nj.us/>
[s]   settings   <scrapy.settings.Settings object at 0x7f24f314d9e8>
[s]   spider     <DefaultSpider 'default' at 0x7f24f24ba7b8>
[s] Useful shortcuts:
[s]   fetch(url[, redirect=True]) Fetch URL and update local objects (by default, redirects are followed)
[s]   fetch(req)                  Fetch a scrapy.Request and update local objects 
[s]   shelp()           Shell help (print this help)
[s]   view(response)    View response in a browser
>>> 

Using OpenSSL 1.1.0f (with cryptography==2.0.3) did not work for me, even when forcing TLS 1.0.
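The same workaround can be applied project-wide instead of on the command line. A minimal sketch, assuming a standard Scrapy project layout; `DOWNLOADER_CLIENT_TLS_METHOD` is the setting used in the shell invocation above:

```python
# settings.py -- force TLS 1.0 for the download handler, equivalent to
# passing -s DOWNLOADER_CLIENT_TLS_METHOD=TLSv1.0 on the command line.
DOWNLOADER_CLIENT_TLS_METHOD = "TLSv1.0"
```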

2 reactions
russian-developer commented, Jun 13, 2020

I tried all the suggestions above but still didn’t manage to fix this problem. URL: https://www.diariooficial.feiradesantana.ba.gov.br/

scrapy==2.0.0
Twisted==20.3.0
pyOpenSSL==19.1.0

Any words of wisdom are much appreciated. 🙏

@anapaulagomes you have to use TLS 1.0 and the RC4-MD5 cipher. The following command should work in the scraper environment: `curl -v --tlsv1.0 --ciphers RC4-MD5 https://www.diariooficial.feiradesantana.ba.gov.br/`. You can get there by compiling OpenSSL with SSLv3 support.
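On Scrapy 1.8 and later, the TLS version and cipher suggested above can be set directly in the project settings. A hedged sketch: `DOWNLOADER_CLIENT_TLS_CIPHERS` was added in Scrapy 1.8, and whether the handshake then succeeds still depends on RC4-MD5 being enabled in the local OpenSSL build, as the comment notes.

```python
# settings.py -- sketch for Scrapy >= 1.8. The cipher string is handed
# to OpenSSL, so RC4-MD5 must be available/enabled in that build.
DOWNLOADER_CLIENT_TLS_METHOD = "TLSv1.0"
DOWNLOADER_CLIENT_TLS_CIPHERS = "RC4-MD5"
```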

Read more comments on GitHub
