Coupling random user_agent (scrapy_fake_useragent) extension with cfscrape

I'm trying to write a Scrapy script that uses cfscrape, privoxy, and scrapy_fake_useragent, but I'm having difficulty understanding how to pass the random user_agent generated by the scrapy_fake_useragent extension to your great cfscrape extension 😃

I cross-posted this question to the scrapy-fake-useragent issue tracker and to Stack Overflow with more explanation, because I think other people may be interested in this combination.

Issue Analytics

  • State: closed
  • Created 7 years ago
  • Comments: 7 (1 by maintainers)

Top GitHub Comments

1 reaction
reyman commented, Jan 13, 2017

It’s not the best solution, but it works like this:

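import cfscrape
import scrapy
from fake_useragent import UserAgent


class AirportsSpider(scrapy.Spider):
    name = "airports"
    start_urls = ['https://...']
    allowed_domains = ['...']
    ua = UserAgent()

    ...

    def start_requests(self):
        cf_requests = []
        # Pick one random User-Agent and reuse it for every start URL.
        user_agent = self.ua.random
        self.logger.info("RANDOM user_agent = %s", user_agent)
        for url in self.start_urls:
            # Solve the Cloudflare challenge with that User-Agent; get_tokens
            # returns the clearance cookies and the User-Agent it used.
            token, agent = cfscrape.get_tokens(url, user_agent)
            self.logger.info("token = %s", token)
            self.logger.info("agent = %s", agent)

            # Send the cookies with the same User-Agent that obtained them.
            cf_requests.append(scrapy.Request(url=url,
                                              cookies=token,
                                              headers={'User-Agent': agent}))
        return cf_requests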
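
A minimal sketch of a variation on the spider above, assuming you would rather draw a fresh user agent for each start URL than share one value across all of them. The names `build_request_kwargs`, `pick_user_agent`, and `fake_get_tokens` are hypothetical and introduced only for illustration; the stub stands in for `cfscrape.get_tokens` so the sketch runs without network access. It keeps the pairing the spider relies on: the cookies are sent with the same User-Agent that obtained them.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Assumed shape of cfscrape.get_tokens(url, user_agent):
# it returns (cookie_dict, user_agent_string).
TokenGetter = Callable[[str, str], Tuple[Dict[str, str], str]]


def build_request_kwargs(
    urls: Iterable[str],
    pick_user_agent: Callable[[], str],
    get_tokens: TokenGetter,
) -> List[dict]:
    """Return one dict of scrapy.Request keyword arguments per URL."""
    kwargs_list = []
    for url in urls:
        # Draw a user agent, then obtain the cookies with that exact value.
        user_agent = pick_user_agent()
        tokens, agent = get_tokens(url, user_agent)
        kwargs_list.append({
            "url": url,
            "cookies": tokens,
            # Reuse the agent returned alongside the cookies.
            "headers": {"User-Agent": agent},
        })
    return kwargs_list


def fake_get_tokens(url: str, user_agent: str) -> Tuple[Dict[str, str], str]:
    """Hypothetical offline stub standing in for cfscrape.get_tokens."""
    return {"cf_clearance": "stub-for-" + url}, user_agent


if __name__ == "__main__":
    for kw in build_request_kwargs(["https://example.com"],
                                   lambda: "ExampleAgent/1.0",
                                   fake_get_tokens):
        print(kw)
```

Inside the spider, this could be called as `build_request_kwargs(self.start_urls, lambda: self.ua.random, cfscrape.get_tokens)`, with each resulting dict passed to `scrapy.Request(**kw)`.
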
1 reaction
reyman commented, Jan 12, 2017

Thanks @Xonshiz, it seems this is a good alternative solution, but I suppose I would need to rewrite all my Scrapy code 😕 and I would lose the capability offered by scrapy_fake_useragent. I have asked the extension's developer, on the project's issue page, whether it is possible to capture the random user_agent value it generates before passing it to cfscrape.

Top Results From Across the Web

scrapy-fake-useragent and cfscrape cloudfare anti bot library #9
My problem is that cfscrape define a random user_agent from a limited list directly writted in the code (see here ) if no...
Share USER_AGENT between scrapy_fake_useragent and ...
I'm using cfscrape python extension to bypass cloudfare protection with scrapy and scrapy_fake_useragent to inject random real USER_AGENT ...
cfscrape - PyPI
cloudflare-scrape. A simple Python module to bypass Cloudflare's anti-bot page (also known as "I'm Under Attack Mode", or IUAM), implemented with Requests.