Not working
For a few days now (it last worked on 5/6/16), the module hasn’t been working properly.
No exception is thrown, but there is no response at all. When I stop my program with Ctrl+C, I get the following:
Traceback (most recent call last):
  File "E:\Kissanime-dl\kissanime-dl.py", line 188, in <module>
    return bs(cfscraper.create_scraper().get(url).content, 'lxml')
followed by an endless loop of
  File "C:\Tools\Anaconda\lib\site-packages\requests\sessions.py", line 487, in get
    return self.request('GET', url, **kwargs)
  File "C:\Tools\Anaconda\lib\site-packages\cfscrape\__init__.py", line 30, in request
    return self.solve_cf_challenge(resp, **kwargs)
  File "C:\Tools\Anaconda\lib\site-packages\cfscrape\__init__.py", line 69, in solve_cf_challenge
    return self.get(submit_url, **kwargs)
and finally,
  File "C:\Tools\Anaconda\lib\site-packages\cfscrape\__init__.py", line 36, in solve_cf_challenge
    time.sleep(5) #Cloudflare requires a delay before solving the challenge
The page I’m trying to scrape is KissAnime.
And here is the source code of the page.
I just noticed that somebody else had an issue with the same page, but that issue was marked closed, and my traceback seems to be different.
Top GitHub Comments
It looks like they may actually have pushed this change specifically to confuse cloudflare-scrape, due to how we’re pulling the JS challenge with regex. I guess they’re probably not fans. (They might’ve been targeting some other closed source script I don’t know about, but it kind of seems like it’s designed to trip us up.)
I also added a few more user agents, of which one will be randomly chosen any time the module is loaded. That should help a little if they start trying to block cloudflare-scrape users in the future.
Please tell me if the most recent commit resolves the issue.
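For anyone curious what "randomly chosen any time the module is loaded" could look like, here is a minimal sketch. It is not the code from the commit: the names `DEFAULT_USER_AGENTS`, `DEFAULT_USER_AGENT` and `default_headers` are made up for illustration, and the user agent strings are just plausible examples, not necessarily the ones the module ships with.

```python
import random

# Hypothetical list for illustration; the strings in the real module may differ.
DEFAULT_USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36",
    "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:46.0) "
    "Gecko/20100101 Firefox/46.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/601.5.17 "
    "(KHTML, like Gecko) Version/9.1 Safari/601.5.17",
]

# Runs once, when the module is imported, so every scraper created in
# this process ends up with the same randomly picked user agent.
DEFAULT_USER_AGENT = random.choice(DEFAULT_USER_AGENTS)


def default_headers():
    """Return request headers carrying the user agent picked at import time."""
    return {"User-Agent": DEFAULT_USER_AGENT}
```

Because the choice happens at import time, the user agent stays the same for the whole run, so the challenge request and the follow-up request should present the same browser identity.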
Actually, now that I think about it, this is probably related to the way Cloudflare does its redirects. I have a ton of security-related stuff on my browser, such that all but the simplest sites break; after passing the JS check, it gets stuck in a redirect loop and I have to go back a page to actually see the site. So this probably has something to do with cfscrape getting stuck in a redirect loop.
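If it really is a loop, the traceback above fits: get calls request, request calls solve_cf_challenge, and solve_cf_challenge calls get again, with nothing to stop the cycle. Below is a rough sketch of how a retry cap could turn the silent hang into an exception. It is a toy model, not the module's actual code: `GuardedScraper`, `ChallengeLoopError`, `max_attempts` and the `fetch` callable are all hypothetical names, and `fetch` stands in for the real HTTP request so the example runs without a network.

```python
class ChallengeLoopError(RuntimeError):
    """Raised when the challenge page keeps coming back (hypothetical name)."""


class GuardedScraper:
    """Toy model of the get -> solve challenge -> get cycle, with a retry cap.

    `fetch` is a hypothetical callable standing in for the real HTTP request.
    It takes a URL and returns a dict with an "is_challenge" flag and, for
    challenge pages, a "submit_url" to request next.
    """

    def __init__(self, fetch, max_attempts=3):
        self.fetch = fetch
        self.max_attempts = max_attempts

    def get(self, url, _attempt=0):
        resp = self.fetch(url)
        if not resp["is_challenge"]:
            return resp  # Normal page: hand it back to the caller.
        if _attempt >= self.max_attempts:
            # Without this check the recursion only ends on Ctrl+C.
            raise ChallengeLoopError(
                "still challenged after %d attempts: %s" % (_attempt, url)
            )
        # The real module waits about five seconds and computes an answer here.
        return self.get(resp["submit_url"], _attempt=_attempt + 1)
```

With a `fetch` that always answers with a challenge page, `GuardedScraper(fetch).get(url)` raises `ChallengeLoopError` after the fourth request instead of sleeping forever, which would at least have made this issue easier to report.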