Jupyter run error ReactorNotRestartable
I am able to run Scrapy in a Jupyter notebook. The first time it works fine.
However, any subsequent attempt fails with the error below.
To get it working again I must restart the Python kernel. It looks like a problem with starting a reactor that has already been run once in the same kernel.
from scrapy.crawler import CrawlerProcess

process = CrawlerProcess({
    'USER_AGENT': 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)'
})
process.crawl(BlogPostSpider)
process.start()  # the script will block here until the crawling is finished
---------------------------------------------------------------------------
ReactorNotRestartable Traceback (most recent call last)
<ipython-input-8-6d30d0f44f41> in <module>()
13 })
14 process.crawl(spider)
---> 15 process.start() # the script will block here until the crawling is finished
16
17
/opt/conda/lib/python3.5/site-packages/scrapy/crawler.py in start(self, stop_after_crawl)
278 tp.adjustPoolsize(maxthreads=self.settings.getint('REACTOR_THREADPOOL_MAXSIZE'))
279 reactor.addSystemEventTrigger('before', 'shutdown', self.stop)
--> 280 reactor.run(installSignalHandlers=False) # blocking call
281
282 def _get_dns_resolver(self):
/opt/conda/lib/python3.5/site-packages/twisted/internet/base.py in run(self, installSignalHandlers)
1240
1241 def run(self, installSignalHandlers=True):
-> 1242 self.startRunning(installSignalHandlers=installSignalHandlers)
1243 self.mainLoop()
1244
/opt/conda/lib/python3.5/site-packages/twisted/internet/base.py in startRunning(self, installSignalHandlers)
1220 """
1221 self._installSignalHandlers = installSignalHandlers
-> 1222 ReactorBase.startRunning(self)
1223
1224
/opt/conda/lib/python3.5/site-packages/twisted/internet/base.py in startRunning(self)
728 raise error.ReactorAlreadyRunning()
729 if self._startedBefore:
--> 730 raise error.ReactorNotRestartable()
731 self._started = True
732 self._stopped = False
ReactorNotRestartable:
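To make the last frame of the traceback easier to follow, here is a toy model of the flag checks it shows. This is only a sketch: `ToyReactor` and the two exception classes are hypothetical stand-ins, not Twisted's code, and the `if self._started` condition and the point where `_startedBefore` gets set are assumptions, since the traceback does not show them.

```python
class ReactorAlreadyRunning(Exception):
    """Toy stand-in for twisted.internet.error.ReactorAlreadyRunning."""


class ReactorNotRestartable(Exception):
    """Toy stand-in for twisted.internet.error.ReactorNotRestartable."""


class ToyReactor:
    """Hypothetical model of the start/stop flags seen in the traceback."""

    def __init__(self):
        self._started = False
        self._startedBefore = False
        self._stopped = True

    def run(self):
        # Assumed guard: the traceback only shows the raise, not its condition.
        if self._started:
            raise ReactorAlreadyRunning()
        # This is the check that fires on the second process.start().
        if self._startedBefore:
            raise ReactorNotRestartable()
        self._started = True
        self._stopped = False
        # Assumption: the flag is recorded once the reactor has started.
        self._startedBefore = True
        # A real reactor would block in its main loop; the toy stops at once.
        self.stop()

    def stop(self):
        self._started = False
        self._stopped = True


reactor = ToyReactor()
reactor.run()  # first run succeeds, like the first notebook cell execution

try:
    reactor.run()  # second run on the same object is refused
except ReactorNotRestartable:
    second_run_failed = True
else:
    second_run_failed = False

print(second_run_failed)
```

In this model the flag lives on the reactor object, and in a notebook that object survives between cell executions, which would explain why only a kernel restart clears the error.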
Issue Analytics
- State:
- Created 7 years ago
- Reactions:4
- Comments:9 (5 by maintainers)
Top GitHub Comments
Maybe we should write a documentation page on how to use Scrapy with Jupyter Notebook for now.
Has any work been done on this? How am I supposed to develop a Scrapy class in a notebook if I have to restart the kernel after every modification?
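One possible workaround, not confirmed anywhere in this thread, is to run each crawl in a fresh Python interpreter so that every run gets a brand-new reactor. The sketch below uses only the standard library. `run_in_fresh_interpreter` and the `myspiders` module are hypothetical names, and it assumes the spider lives in a file that the child process can import.

```python
import subprocess
import sys


def run_in_fresh_interpreter(source, timeout=300):
    """Run Python source in a new interpreter and return the CompletedProcess."""
    return subprocess.run(
        [sys.executable, "-c", source],  # same Python as the notebook kernel
        capture_output=True,             # collect stdout/stderr for inspection
        text=True,
        timeout=timeout,
    )


# The crawl from the issue, kept as a string so it runs in the child process.
# `myspiders` is a hypothetical module that holds BlogPostSpider.
CRAWL_SCRIPT = """
from scrapy.crawler import CrawlerProcess
from myspiders import BlogPostSpider

process = CrawlerProcess({
    'USER_AGENT': 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)'
})
process.crawl(BlogPostSpider)
process.start()
"""

# Example call, commented out because it needs Scrapy and the spider module:
# result = run_in_fresh_interpreter(CRAWL_SCRIPT)
# print(result.returncode)
# print(result.stderr[-2000:])  # Scrapy logs to stderr by default
```

Because the reactor lives and dies inside the child process, the cell can be re-run after each change to the spider file without restarting the kernel. The trade-off is that scraped items are not returned as Python objects; they would have to be written to a file or printed and read back.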