SSL issue when scraping website
I have a spider that’s throwing the following error when trying to crawl this URL.
>>> fetch('https://vconnections.org/resources')
2015-08-12 10:07:28 [scrapy] INFO: Spider opened
2015-08-12 10:07:28 [scrapy] DEBUG: Retrying <GET https://vconnections.org/resources> (failed 1 times): [<twisted.python.failure.Failure <class 'OpenSSL.SSL.Error'>>]
2015-08-12 10:07:33 [scrapy] DEBUG: Gave up retrying <GET https://vconnections.org/resources> (failed 2 times): [<twisted.python.failure.Failure <class 'OpenSSL.SSL.Error'>>]
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/Users/gmeans/.virtualenvs/backlink/lib/python2.7/site-packages/scrapy/shell.py", line 87, in fetch
    reactor, self._schedule, request, spider)
  File "/Users/gmeans/.virtualenvs/backlink/lib/python2.7/site-packages/twisted/internet/threads.py", line 122, in blockingCallFromThread
    result.raiseException()
  File "<string>", line 2, in raiseException
ResponseNeverReceived: [<twisted.python.failure.Failure <class 'OpenSSL.SSL.Error'>>]
Other HTTPS URLs work fine, and I tried implementing the solution from this previous issue:
https://github.com/scrapy/scrapy/issues/981
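from OpenSSL import SSL
from scrapy.core.downloader.contextfactory import ScrapyClientContextFactory
from twisted.internet._sslverify import ClientTLSOptions
from twisted.internet.ssl import ClientContextFactory


class CustomContextFactory(ScrapyClientContextFactory):
    def getContext(self, hostname=None, port=None):
        ctx = ClientContextFactory.getContext(self)
        # Enable all workarounds to SSL bugs as documented by
        # http://www.openssl.org/docs/ssl/SSL_CTX_set_options.html
        ctx.set_options(SSL.OP_ALL)
        if hostname:
            # Attach the hostname to the context so SNI is sent
            ClientTLSOptions(hostname, ctx)
        return ctx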
Scrapy==1.0.3 Twisted==15.3.0 pyOpenSSL==0.15.1
OpenSSL 1.0.1k 8 Jan 2015
Any ideas on what else I could try? Thanks!
Issue Analytics
- Created 8 years ago
- Comments: 29 (9 by maintainers)
Top GitHub Comments
Sure, @wilsoncusack.
Context Factory, really simple in the end:
Then make sure you update the settings.py:
Yes, I had to update OpenSSL via Homebrew for this to work. That’s because Apple has stopped using OpenSSL and switched to their own libraries.
No side effect I’ve seen, but I did this in a virtualenv.
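The settings.py update mentioned above did not survive in this copy of the comment. As a rough sketch only: Scrapy's `DOWNLOADER_CLIENTCONTEXTFACTORY` setting takes the dotted import path of a context factory class. The module path `myproject.contextfactory` below is a hypothetical placeholder, not the path the commenter used.

```python
# settings.py
# Point Scrapy's downloader at a custom SSL context factory.
# 'myproject.contextfactory' is an assumed, illustrative module path;
# replace it with wherever your CustomContextFactory class actually lives.
DOWNLOADER_CLIENTCONTEXTFACTORY = 'myproject.contextfactory.CustomContextFactory'
```

Scrapy imports the class named by this string when the downloader starts, so a typo in the path usually shows up as an import error at spider startup.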
Just set the DOWNLOADER_CLIENT_TLS_METHOD setting to 'TLSv1.2' in your project's settings.py. There is no longer any need to use the custom context factory to solve this problem.
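For reference, the change described in that last comment is a one-line addition to settings.py. As far as I know the setting was introduced in Scrapy 1.1, so the Scrapy 1.0.3 install from the original report would likely need upgrading first; treat that version detail as an assumption and check it against your installed release.

```python
# settings.py
# Force the downloader to negotiate TLS 1.2 with HTTPS servers.
# Assumption: your Scrapy version supports this setting (believed to be 1.1+).
DOWNLOADER_CLIENT_TLS_METHOD = 'TLSv1.2'
```

If the server only accepts a different protocol version, the other values documented for this setting (such as 'TLS' for negotiation) may be worth trying instead.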