Exception when using DummyStatsCollector
See original GitHub issue.

Description
Using the DummyStatsCollector results in an exception:
2019-09-09 13:51:23 [scrapy.utils.signal] ERROR: Error caught on signal handler: <bound method CoreStats.spider_closed of <scrapy.extensions.corestats.CoreStats object at 0x7f86269cac18>>
Traceback (most recent call last):
  File ".../lib/python3.6/site-packages/twisted/internet/defer.py", line 150, in maybeDeferred
    result = f(*args, **kw)
  File ".../lib/python3.6/site-packages/pydispatch/robustapply.py", line 55, in robustApply
    return receiver(*arguments, **named)
  File ".../lib/python3.6/site-packages/scrapy/extensions/corestats.py", line 28, in spider_closed
    elapsed_time = finish_time - self.stats.get_value('start_time')
TypeError: unsupported operand type(s) for -: 'datetime.datetime' and 'NoneType'
This problem was introduced in commit aa46e1995cd5cb1099aba17535372b538bd656b3.
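The failure is easy to reproduce outside Scrapy: DummyStatsCollector.get_value() returns its default argument, which is None when no default is supplied, and subtracting None from a datetime raises exactly the TypeError in the traceback. A minimal standalone demonstration (mimicking the dummy collector's behavior rather than importing Scrapy):

```python
import datetime

# DummyStatsCollector.get_value() returns its `default` argument, which is
# None when no default is supplied -- mimic that return value here.
start_time = None
finish_time = datetime.datetime.utcnow()

try:
    elapsed_time = finish_time - start_time
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```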
Steps to Reproduce
Set STATS_CLASS = "scrapy.statscollectors.DummyStatsCollector"
in the settings module as described in the documentation (https://docs.scrapy.org/en/latest/topics/stats.html#dummystatscollector).
Expected behavior: no exception
Actual behavior: exception thrown
Reproduces how often: always
Versions
At least master as of 534de7395da3a53b5a2c89960db9ec5d8fdab60c
Fix
A possible fix is to pass finish_time as the default argument, so that get_value()
never returns None (for a stats collector that stores nothing, the elapsed time then comes out as zero). I can prepare a PR if needed.
--- a/scrapy/extensions/corestats.py
+++ b/scrapy/extensions/corestats.py
@@ -25,7 +25,7 @@ class CoreStats(object):
     def spider_closed(self, spider, reason):
         finish_time = datetime.datetime.utcnow()
-        elapsed_time = finish_time - self.stats.get_value('start_time')
+        elapsed_time = finish_time - self.stats.get_value('start_time', finish_time)
         elapsed_time_seconds = elapsed_time.total_seconds()
         self.stats.set_value('elapsed_time_seconds', elapsed_time_seconds, spider=spider)
         self.stats.set_value('finish_time', finish_time, spider=spider)
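The fix works because the dummy collector hands back whatever default it is given. A simplified stand-in for scrapy.statscollectors.DummyStatsCollector (the class name and signature below mirror the real one, but this is a sketch, not the Scrapy source) shows the subtraction succeeding with a zero result:

```python
import datetime

class DummyStatsCollector:
    """Simplified stand-in for scrapy.statscollectors.DummyStatsCollector:
    it stores nothing and always returns the caller-supplied default."""
    def get_value(self, key, default=None, spider=None):
        return default

stats = DummyStatsCollector()
finish_time = datetime.datetime.utcnow()

# With the fix, get_value() falls back to finish_time instead of None,
# so the subtraction yields a zero timedelta rather than raising TypeError.
elapsed_time = finish_time - stats.get_value('start_time', finish_time)
print(elapsed_time.total_seconds())  # 0.0
```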
Issue Analytics
- State:
- Created 4 years ago
- Comments: 5 (4 by maintainers)
Makes sense, the extension should be robust enough not to break depending on the Stats class implementation. In that case, something along the following lines should do the trick:
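One defensive shape such a handler could take (a hypothetical sketch under the "be robust regardless of the stats class" idea above, not the snippet the maintainer actually posted) is to skip the elapsed-time computation when no start time was recorded:

```python
import datetime

class CoreStats:
    """Hypothetical defensive variant of scrapy.extensions.corestats.CoreStats:
    only computes elapsed time when the collector actually stored a start time."""
    def __init__(self, stats):
        self.stats = stats

    def spider_closed(self, spider, reason):
        finish_time = datetime.datetime.utcnow()
        start_time = self.stats.get_value('start_time')
        if start_time is not None:
            elapsed = (finish_time - start_time).total_seconds()
            self.stats.set_value('elapsed_time_seconds', elapsed, spider=spider)
        self.stats.set_value('finish_time', finish_time, spider=spider)
        self.stats.set_value('finish_reason', reason, spider=spider)

class _NullStats:
    """Stand-in for a dummy stats collector that stores nothing."""
    def get_value(self, key, default=None, spider=None):
        return default
    def set_value(self, key, value, spider=None):
        pass

# With a stats class that never stored 'start_time', the handler no longer raises.
CoreStats(_NullStats()).spider_closed(spider=None, reason='finished')
```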
I still think it might be a good idea to skip connecting the handlers, though 😅 Let's see what the other committers think.
No specific reason that I can see; the previous code was already using utcnow and it just didn't occur to me to change it. In any case, I don't think it changes things: AFAICT all datetimes handled by Scrapy are in UTC and there are no timezone conversions. That said, I'm happy to be proven wrong. If you think there is something to be improved, please do suggest it or open a PR.