Jupyter run error ReactorNotRestartable

I am able to run Scrapy in a Jupyter notebook. The first time it works fine.
However, any subsequent attempt fails with the error below.

To get it working again, I must restart the Python kernel. It looks like a problem of starting a reactor when one is already up and running.

from scrapy.crawler import CrawlerProcess

# BlogPostSpider is the spider class under test, defined elsewhere (not shown in the issue).
process = CrawlerProcess({
    'USER_AGENT': 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)'
})
process.crawl(BlogPostSpider)
process.start()  # the script will block here until the crawling is finished

---------------------------------------------------------------------------
ReactorNotRestartable                     Traceback (most recent call last)
<ipython-input-8-6d30d0f44f41> in <module>()
     13 })
     14 process.crawl(spider)
---> 15 process.start() # the script will block here until the crawling is finished
     16 
     17 

/opt/conda/lib/python3.5/site-packages/scrapy/crawler.py in start(self, stop_after_crawl)
    278         tp.adjustPoolsize(maxthreads=self.settings.getint('REACTOR_THREADPOOL_MAXSIZE'))
    279         reactor.addSystemEventTrigger('before', 'shutdown', self.stop)
--> 280         reactor.run(installSignalHandlers=False)  # blocking call
    281 
    282     def _get_dns_resolver(self):

/opt/conda/lib/python3.5/site-packages/twisted/internet/base.py in run(self, installSignalHandlers)
   1240 
   1241     def run(self, installSignalHandlers=True):
-> 1242         self.startRunning(installSignalHandlers=installSignalHandlers)
   1243         self.mainLoop()
   1244 

/opt/conda/lib/python3.5/site-packages/twisted/internet/base.py in startRunning(self, installSignalHandlers)
   1220         """
   1221         self._installSignalHandlers = installSignalHandlers
-> 1222         ReactorBase.startRunning(self)
   1223 
   1224 

/opt/conda/lib/python3.5/site-packages/twisted/internet/base.py in startRunning(self)
    728             raise error.ReactorAlreadyRunning()
    729         if self._startedBefore:
--> 730             raise error.ReactorNotRestartable()
    731         self._started = True
    732         self._stopped = False

ReactorNotRestartable:
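
The last frame of the traceback shows why the second process.start() fails: startRunning raises ReactorNotRestartable as soon as the _startedBefore flag is set, which means the reactor has already been run once and stopped. The sketch below is a simplified toy model of that check, not Twisted's actual code. ToyReactor, start_running, and stop are hypothetical names, and the place where the flag gets set is an assumption based on the lines visible in the traceback.

```python
class ReactorAlreadyRunning(Exception):
    """Stand-in for twisted.internet.error.ReactorAlreadyRunning."""


class ReactorNotRestartable(Exception):
    """Stand-in for twisted.internet.error.ReactorNotRestartable."""


class ToyReactor:
    """Toy model of the start/stop bookkeeping seen in the traceback."""

    def __init__(self):
        self._started = False
        self._stopped = True
        self._startedBefore = False

    def start_running(self):
        if self._started:
            raise ReactorAlreadyRunning()
        if self._startedBefore:
            raise ReactorNotRestartable()
        self._started = True
        self._stopped = False
        # Assumption: the flag is set on the first start and never cleared.
        self._startedBefore = True

    def stop(self):
        self._started = False
        self._stopped = True


reactor = ToyReactor()
reactor.start_running()  # first run: fine
reactor.stop()           # crawl finished, reactor stopped

try:
    reactor.start_running()  # second run in the same process
except ReactorNotRestartable:
    second_run = "ReactorNotRestartable"
else:
    second_run = "ok"

print(second_run)
```

In this model, as in the notebook, the only way to get a reactor that has never been started is to begin with a fresh process, which is why restarting the kernel helps.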

Issue Analytics

State:
Created 7 years ago
Reactions: 4
Comments: 9 (5 by maintainers)

Top GitHub Comments

8 reactions

Gallaecio commented, Jul 8, 2019

Maybe we should write a documentation page on how to use Scrapy with Jupyter Notebook for now.

1 reaction

nareto commented, Aug 6, 2020

Has any work been done on this? How am I supposed to develop a Scrapy class in a notebook if I have to restart the kernel for each modification?
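
One possible workaround for this notebook workflow, sketched here under stated assumptions rather than taken from the thread, is to run each crawl in a short-lived child process so that the kernel itself never starts the reactor. run_in_fresh_process and crawl_once are hypothetical helper names. The sketch assumes a platform where the "fork" start method is available (for example Linux), so that spiders defined in notebook cells are visible to the child without pickling.

```python
import multiprocessing as mp


def run_in_fresh_process(func, *args):
    """Run func(*args) in a forked child process and return its result."""
    # Assumption: the "fork" start method exists on this platform.
    ctx = mp.get_context("fork")
    queue = ctx.Queue()

    def _target():
        try:
            queue.put((True, func(*args)))
        except Exception as exc:
            # Send a printable description of the failure to the parent.
            queue.put((False, repr(exc)))

    proc = ctx.Process(target=_target)
    proc.start()
    ok, payload = queue.get()  # read the result before joining the child
    proc.join()
    if not ok:
        raise RuntimeError(payload)
    return payload


def crawl_once(spider_cls, settings=None):
    """Run one crawl to completion; meant to be called inside the child."""
    # Imported here so the parent process never imports or starts the reactor.
    from scrapy.crawler import CrawlerProcess

    process = CrawlerProcess(settings or {})
    process.crawl(spider_cls)
    process.start()  # blocks until the crawl finishes
    return True
```

In a notebook cell, the call would then look like run_in_fresh_process(crawl_once, BlogPostSpider), repeated as often as needed after editing the spider. This only helps if the kernel has not already started the reactor itself, because a forked child inherits the parent's reactor state. The sketch also has no timeout, so a child that dies without reporting a result would leave the parent waiting.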

Top Results From Across the Web

Scrapy: fail to re-run in Jupyter Notebook script, reporting ...

The problem is that, after I run the above script, I can't run it again. Jupyter notebook returns the error ReactorNotRestartable.

ReactorNotRestartable error while connecting to the websocket

It seems to be some issue with your Anaconda environment. Websocket code is working fine in our Linux Anaconda environment on Jupyter Notebook....

Scrapy: Fail To Re-Run In Jupyter Notebook Script ... - ADocLib

I am getting the following error: You cannot restart the reactor, but you should be able to run it more CrawlerRunner() deferred runner.crawl(spider)...

Run Scrapy code from Jupyter Notebook without issues

ReactorNotRestartable error can be mitigated using this package. In this blog post, I am showing the steps that I took to run scrapy...

How to overcome Scrapy - Reactor not Restartable - Reddit

If I need to search another term, I need to restart Jupyter Notebook. Otherwise, I will receive Reactor not restartable error.