How to run multiple spiders **independently** at the same time?
See original GitHub issue
I wrote two simple spiders:
```python
from scrapy import Spider, Request
from scrapy.crawler import Crawler
from twisted.internet import reactor


class Bing(Spider):
    name = "bing"

    def start_requests(self):
        for _ in range(1000):
            yield Request("http://bing.com", dont_filter=True)

    def parse(self, response):
        print(self.name, response.url)


class Sogou(Spider):
    name = "sogou"

    def start_requests(self):
        for _ in range(1000):
            yield Request("http://sogou.com", dont_filter=True)

    def parse(self, response):
        while True:  # deliberately blocks inside this callback
            print('123')
        print(self.name, response.url)


def run(spider_cls):
    crawler = Crawler(spider_cls)
    crawler.crawl()


run(Bing)
run(Sogou)
reactor.run()
```
I run them like this, using `Crawler` and the reactor. The Sogou spider blocks, which is obvious, but the Bing spider blocks too. Why, and how can I solve it? Can the `Crawler` class make multiple spiders run independently at the same time? I also tried twisted threads such as `deferToThread` and `callInThread`, but they didn't work. Does anyone have an idea? Thank you very much!
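Background on the question above: Scrapy drives every spider from a single Twisted reactor thread, so a callback that never returns (the `while True` loop in `Sogou.parse`) starves every other spider sharing that reactor. The same effect can be reproduced with the standard-library `asyncio` event loop (an analogy only, not Scrapy code; the `bing`/`sogou` names here are illustrative):

```python
import asyncio

log = []

async def bing():
    # A well-behaved task: it yields control back to the loop each iteration.
    for _ in range(3):
        log.append("bing")
        await asyncio.sleep(0)

async def sogou():
    # A blocking callback: this busy loop never awaits, so the single
    # event-loop thread is stuck here and "bing" cannot be scheduled.
    for _ in range(1_000_000):
        pass  # stand-in for `while True: print('123')`
    log.append("sogou done blocking")

async def main():
    await asyncio.gather(bing(), sogou())

asyncio.run(main())
print(log)  # "bing" runs only once before sogou monopolizes the loop
```

The fix in either framework is the same: keep callbacks short and non-blocking, and push long-running work onto a thread (e.g. Twisted's `deferToThread`) instead of looping inside the callback.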
Issue Analytics
- State:
- Created 5 years ago
- Reactions: 3
- Comments: 12 (6 by maintainers)
Top Results From Across the Web

Do we have an option to run multiple spiders "independently ...
I currently run 300+ spiders sequentially and dynamically from a Django database. I also tried to run the spiders simultaneously using ...

how to run multiple spiders concurrently in code?
Short answer: running multiple spiders in the same `scrapy crawl` process is no longer supported (since 0.14) in favour of using scrapyd.

Common Practices — Scrapy 2.7.1 documentation
The first utility you can use to run your spiders is `scrapy.crawler.CrawlerProcess`. ... Here is an example that runs multiple spiders simultaneously: ...

Run Scrapy Spiders from Python Script - YouTube
Learn how to call a Scrapy spider from main.py, a question that I get often. You will learn how to run multiple Scrapy spiders ...

How to Run Scrapy as a Stand-Alone Script | Teracrawler
Let's take a simple Scrapy crawler that crawls quotes and see if we can make it run standalone ... import scrapy; from scrapy.spiders import ...
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Hi, @xiaochonzi
Please refer to the section "Running multiple spiders in the same process" in the document Common Practices — Scrapy 1.5.0 documentation. There is a very clear example for your question.
Hi, @xiaochonzi
Please read the section I mentioned before carefully. The document shows exactly how to do what you want.