How to get all the items (via Signal API)

Hello there,

I wrote the following code:

process = CrawlerProcess()
results = []

def crawler_results(parse_result):
    results.append(parse_result)

# The line stuff are some params, not interesting ;)
process.crawl(BASpider, self.server, line[0], line[2], line[1])
for p in process.crawlers:
    p.signals.connect(crawler_results, signal=scrapy.signals.item_dropped)
process.start()

But the method crawler_results is never triggered, and I do not understand why. Is it not supposed to work like this? It does not work with item_scraped or engine_started either; no events are triggered at all.

Otherwise, how would you retrieve the items of a spider with CrawlerProcess? (Using a pipeline is not convenient, because I need to send all the data at once.)

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 8 (6 by maintainers)

Top GitHub Comments

3 reactions
kmike commented, Feb 8, 2018

Hey @lolobosse! I can’t reproduce the issue. The following script works for me, i.e. it prints results:

# -*- coding: utf-8 -*-
import scrapy
import scrapy.signals
from scrapy.crawler import CrawlerProcess


class BASpider(scrapy.Spider):
    name = 'ba'
    start_urls = ['http://example.com']

    def parse(self, response):
        yield {'url': response.url}


process = CrawlerProcess()
results = []

def crawler_results(item, response, spider):
    results.append(item)
    print(results)

process.crawl(BASpider)
for p in process.crawlers:
    p.signals.connect(crawler_results, signal=scrapy.signals.item_scraped)

process.start()

Maybe the problem is that you’re not using the right signature for the signal handler. See the item_dropped argument names here: https://doc.scrapy.org/en/latest/topics/signals.html#item-dropped. Note that you must use the same argument names as in the docs, because they are passed as keyword arguments.
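
To illustrate why the argument names matter, here is a minimal sketch of keyword-based dispatch. It is a toy and not Scrapy's actual implementation: `send_signal`, `bad_handler`, `good_handler` and `payload` are hypothetical names, and the sketch assumes a dispatcher that passes only the keyword arguments a handler declares. The argument names used for `good_handler` are the ones the docs list for item_dropped.

```python
import inspect

def send_signal(handler, **signal_kwargs):
    """Toy dispatcher (hypothetical): call handler with the kwargs it declares."""
    params = inspect.signature(handler).parameters
    takes_var_kw = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if takes_var_kw:
        accepted = signal_kwargs
    else:
        # Keep only the keyword arguments whose names the handler declares.
        accepted = {k: v for k, v in signal_kwargs.items() if k in params}
    try:
        handler(**accepted)
        return True
    except TypeError as exc:
        # A mismatched signature ends up here instead of collecting the item.
        print(f"handler {handler.__name__} failed: {exc}")
        return False

results = []

def bad_handler(parse_result):
    # 'parse_result' is not a name the signal sends, so it is never filled in.
    results.append(parse_result)

def good_handler(item, response, exception, spider):
    # Names match the documented item_dropped arguments.
    results.append(item)

# Stand-in values for what an item_dropped signal would carry.
payload = dict(item={'url': 'http://example.com'}, response=None,
               exception=None, spider=None)

bad_ok = send_signal(bad_handler, **payload)
good_ok = send_signal(good_handler, **payload)
```

In this sketch only `good_handler` receives the item. `bad_handler` fails because nothing is ever passed under the name `parse_result`.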

0 reactions
kmike commented, Feb 12, 2018

@lolobosse can we close this issue? Was it an issue with signal argument names?
