
scrapy bench fails in 2.6.1

See original GitHub issue

Description

Running scrapy bench in 2.6.1 raises an AttributeError during shutdown.

Steps to Reproduce

  1. pip install scrapy==2.6.1 --upgrade
  2. scrapy bench

Expected behavior: the benchmark completes cleanly, as it does in 2.5.1:

2022-03-15 20:59:25 [scrapy.core.engine] INFO: Spider closed (closespider_timeout)

Actual behavior:

2022-03-15 20:57:36 [scrapy.utils.log] INFO: Scrapy 2.6.1 started (bot: scrapybot)
2022-03-15 20:57:36 [scrapy.utils.log] INFO: Versions: lxml 4.8.0.0, libxml2 2.9.12, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 22.2.0, Python 3.9.7 (default, Sep 16 2021, 13:09:58) - [GCC 7.5.0], pyOpenSSL 22.0.0 (OpenSSL 1.1.1m  14 Dec 2021), cryptography 36.0.1, Platform Linux-5.10.0-12-amd64-x86_64-with-glibc2.31
2022-03-15 20:57:37 [scrapy.crawler] INFO: Overridden settings:
{'CLOSESPIDER_TIMEOUT': 10, 'LOGSTATS_INTERVAL': 1, 'LOG_LEVEL': 'INFO'}
2022-03-15 20:57:37 [scrapy.extensions.telnet] INFO: Telnet Password: cd24a9fdad5af2a4
2022-03-15 20:57:37 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.memusage.MemoryUsage',
 'scrapy.extensions.closespider.CloseSpider',
 'scrapy.extensions.logstats.LogStats']
2022-03-15 20:57:37 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2022-03-15 20:57:37 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2022-03-15 20:57:37 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2022-03-15 20:57:37 [scrapy.core.engine] INFO: Spider opened
2022-03-15 20:57:37 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2022-03-15 20:57:37 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2022-03-15 20:57:38 [scrapy.extensions.logstats] INFO: Crawled 61 pages (at 3660 pages/min), scraped 0 items (at 0 items/min)
2022-03-15 20:57:39 [scrapy.extensions.logstats] INFO: Crawled 117 pages (at 3360 pages/min), scraped 0 items (at 0 items/min)
2022-03-15 20:57:40 [scrapy.extensions.logstats] INFO: Crawled 165 pages (at 2880 pages/min), scraped 0 items (at 0 items/min)
2022-03-15 20:57:41 [scrapy.extensions.logstats] INFO: Crawled 214 pages (at 2940 pages/min), scraped 0 items (at 0 items/min)
2022-03-15 20:57:42 [scrapy.extensions.logstats] INFO: Crawled 269 pages (at 3300 pages/min), scraped 0 items (at 0 items/min)
2022-03-15 20:57:43 [scrapy.extensions.logstats] INFO: Crawled 317 pages (at 2880 pages/min), scraped 0 items (at 0 items/min)
2022-03-15 20:57:44 [scrapy.extensions.logstats] INFO: Crawled 358 pages (at 2460 pages/min), scraped 0 items (at 0 items/min)
2022-03-15 20:57:45 [scrapy.extensions.logstats] INFO: Crawled 406 pages (at 2880 pages/min), scraped 0 items (at 0 items/min)
2022-03-15 20:57:46 [scrapy.extensions.logstats] INFO: Crawled 453 pages (at 2820 pages/min), scraped 0 items (at 0 items/min)
2022-03-15 20:57:47 [scrapy.core.engine] INFO: Closing spider (closespider_timeout)
2022-03-15 20:57:47 [scrapy.extensions.logstats] INFO: Crawled 486 pages (at 1980 pages/min), scraped 0 items (at 0 items/min)
2022-03-15 20:57:48 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 215773,
 'downloader/request_count': 501,
 'downloader/request_method_count/GET': 501,
 'downloader/response_bytes': 1458919,
 'downloader/response_count': 501,
 'downloader/response_status_count/200': 501,
 'elapsed_time_seconds': 10.63502,
 'finish_reason': 'closespider_timeout',
 'finish_time': datetime.datetime(2022, 3, 15, 19, 57, 48, 193820),
 'log_count/INFO': 20,
 'memusage/max': 62078976,
 'memusage/startup': 62078976,
 'request_depth_max': 19,
 'response_received_count': 501,
 'scheduler/dequeued': 501,
 'scheduler/dequeued/memory': 501,
 'scheduler/enqueued': 10021,
 'scheduler/enqueued/memory': 10021,
 'start_time': datetime.datetime(2022, 3, 15, 19, 57, 37, 558800)}
2022-03-15 20:57:48 [scrapy.core.engine] INFO: Spider closed (closespider_timeout)
2022-03-15 20:57:48 [scrapy.core.engine] INFO: Error while scheduling new request
Traceback (most recent call last):
  File "/home/nordange/anaconda3/envs/scraping/lib/python3.9/site-packages/twisted/internet/task.py", line 526, in _oneWorkUnit
    result = next(self._iterator)
StopIteration

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/nordange/anaconda3/envs/scraping/lib/python3.9/site-packages/twisted/internet/defer.py", line 857, in _runCallbacks
    current.result = callback(  # type: ignore[misc]
  File "/home/nordange/anaconda3/envs/scraping/lib/python3.9/site-packages/scrapy/core/engine.py", line 187, in <lambda>
    d.addBoth(lambda _: self.slot.nextcall.schedule())
AttributeError: 'NoneType' object has no attribute 'nextcall'

Reproduces how often: 100%
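The traceback shows that the `d.addBoth(lambda _: self.slot.nextcall.schedule())` callback in `scrapy/core/engine.py` fires after the engine has already cleared its slot during spider close, so `self.slot` is `None` by the time the lambda runs. The `Slot`/`Engine` classes below are a simplified, hypothetical sketch of that pattern, not Scrapy's actual implementation:

```python
class Slot:
    """Stand-in for the engine's per-spider slot."""
    def __init__(self):
        self.nextcall = object()  # stands in for the real CallLaterOnce

class Engine:
    def __init__(self):
        self.slot = Slot()

    def schedule_callback(self):
        # The callback reads self.slot lazily, only when it eventually fires.
        return lambda _: self.slot.nextcall

    def close_spider(self):
        # On close, the engine drops its slot...
        self.slot = None

engine = Engine()
cb = engine.schedule_callback()
engine.close_spider()
try:
    # ...but the previously scheduled callback still fires afterwards.
    cb(None)
except AttributeError as exc:
    print(exc)  # 'NoneType' object has no attribute 'nextcall'
```

Under this reading, a defensive fix is to check that the slot still exists before dereferencing it in the callback (e.g. `if self.slot: self.slot.nextcall.schedule()`); the comments below suggest the maintainers resolved it along similar lines.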

Versions

Scrapy       : 2.6.1
lxml         : 4.8.0.0
libxml2      : 2.9.12
cssselect    : 1.1.0
parsel       : 1.6.0
w3lib        : 1.22.0
Twisted      : 22.2.0
Python       : 3.9.7 (default, Sep 16 2021, 13:09:58) - [GCC 7.5.0]
pyOpenSSL    : 22.0.0 (OpenSSL 1.1.1m  14 Dec 2021)
cryptography : 36.0.1
Platform     : Linux-5.10.0-12-amd64-x86_64-with-glibc2.31

Issue Analytics

  • State:closed
  • Created 2 years ago
  • Comments:5 (4 by maintainers)

Top GitHub Comments

1 reaction
Laerte commented, Mar 15, 2022

@Gallaecio I tested locally

Fixes the issue.

0 reactions
Gallaecio commented, May 30, 2022

Soon™


