Jupyter run error ReactorNotRestartable

I am able to run Scrapy in a Jupyter notebook. The first time, it works fine.
However, any subsequent attempt fails with the error below.

To get it working again, I must restart the Python kernel. It looks like the problem is that the reactor is being started again after it has already run once.

from scrapy.crawler import CrawlerProcess

# BlogPostSpider is assumed to be a spider class defined in an earlier notebook cell (not shown).
process = CrawlerProcess({
    'USER_AGENT': 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)'
})
process.crawl(BlogPostSpider)
process.start()  # the script will block here until the crawling is finished

---------------------------------------------------------------------------
ReactorNotRestartable                     Traceback (most recent call last)
<ipython-input-8-6d30d0f44f41> in <module>()
     13 })
     14 process.crawl(spider)
---> 15 process.start() # the script will block here until the crawling is finished
     16 
     17 

/opt/conda/lib/python3.5/site-packages/scrapy/crawler.py in start(self, stop_after_crawl)
    278         tp.adjustPoolsize(maxthreads=self.settings.getint('REACTOR_THREADPOOL_MAXSIZE'))
    279         reactor.addSystemEventTrigger('before', 'shutdown', self.stop)
--> 280         reactor.run(installSignalHandlers=False)  # blocking call
    281 
    282     def _get_dns_resolver(self):

/opt/conda/lib/python3.5/site-packages/twisted/internet/base.py in run(self, installSignalHandlers)
   1240 
   1241     def run(self, installSignalHandlers=True):
-> 1242         self.startRunning(installSignalHandlers=installSignalHandlers)
   1243         self.mainLoop()
   1244 

/opt/conda/lib/python3.5/site-packages/twisted/internet/base.py in startRunning(self, installSignalHandlers)
   1220         """
   1221         self._installSignalHandlers = installSignalHandlers
-> 1222         ReactorBase.startRunning(self)
   1223 
   1224 

/opt/conda/lib/python3.5/site-packages/twisted/internet/base.py in startRunning(self)
    728             raise error.ReactorAlreadyRunning()
    729         if self._startedBefore:
--> 730             raise error.ReactorNotRestartable()
    731         self._started = True
    732         self._stopped = False

ReactorNotRestartable: 
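
The last frame of the traceback shows the guard that raises the error: a flag remembers that the reactor has been started before, and a second start is refused. The following is a toy model of that behaviour, not Twisted's actual code. `ToyReactor` and its attribute names are hypothetical, and where exactly the real reactor sets its flag is an assumption here.

```python
class ReactorNotRestartable(Exception):
    """Stand-in for twisted.internet.error.ReactorNotRestartable."""


class ToyReactor:
    """Hypothetical, heavily simplified model of a run-once event loop."""

    def __init__(self):
        self._started = False
        self._started_before = False

    def run(self):
        if self._started:
            raise RuntimeError("reactor is already running")
        if self._started_before:
            # Same kind of check as the one at the bottom of the traceback.
            raise ReactorNotRestartable()
        self._started = True
        # A real reactor would block here in its main loop until stopped.
        self._stop()

    def _stop(self):
        self._started = False
        self._started_before = True  # assumption: set when the loop stops


reactor = ToyReactor()
reactor.run()  # first run succeeds, like the first notebook execution

try:
    reactor.run()  # second run in the same process is refused
except ReactorNotRestartable:
    second_run_failed = True
else:
    second_run_failed = False
```

Because the reactor object lives for as long as the kernel process does, re-running the cell reuses the same, already stopped reactor, which would explain why only a kernel restart clears the error.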

Issue Analytics

  • State: open
  • Created: 7 years ago
  • Reactions: 4
  • Comments: 9 (5 by maintainers)

Top GitHub Comments

8 reactions
Gallaecio commented, Jul 8, 2019

Maybe we should write a documentation page on how to use Scrapy with Jupyter Notebook for now.

1 reaction
nareto commented, Aug 6, 2020

Has any work been done on this? How am I supposed to develop a Scrapy class in a notebook if I have to restart the kernel after every modification?
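
One workaround for the restart problem raised in the comment above is to run each crawl in a short-lived child process, so that every run gets a reactor that has never been started. The sketch below is not an official Scrapy recipe. The helpers `run_in_fresh_process` and `crawl` are hypothetical names, and it assumes a POSIX platform where the `fork` start method is available and a kernel process that has not started the reactor itself.

```python
import multiprocessing as mp


def run_in_fresh_process(target, *args, **kwargs):
    """Run target(*args, **kwargs) in a short-lived child process.

    Returns the target's return value, which must be picklable.
    """
    # Assumption: a POSIX platform. "fork" lets the child see functions and
    # classes defined in notebook cells without having to pickle them.
    ctx = mp.get_context("fork")
    results = ctx.Queue()

    def _worker():
        try:
            results.put((True, target(*args, **kwargs)))
        except Exception as exc:  # report the failure to the parent
            results.put((False, repr(exc)))

    child = ctx.Process(target=_worker)
    child.start()
    ok, payload = results.get()  # read before join() so a large result cannot block
    child.join()
    if not ok:
        raise RuntimeError(payload)
    return payload


def crawl(spider_cls, settings=None):
    """Runs inside the child: start the reactor once, then let the process exit."""
    from scrapy.crawler import CrawlerProcess

    process = CrawlerProcess(settings or {})
    process.crawl(spider_cls)
    process.start()  # blocks until the crawl is finished
    return True
```

In a notebook cell, `run_in_fresh_process(crawl, BlogPostSpider)` could then be called repeatedly after editing the spider. Scraped items stay in the child process unless they are written to a feed or returned, and this simple version waits indefinitely if the child is killed before it reports a result.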

Top Results From Across the Web

Scrapy: fail to re-run in Jupyter Notebook script, reporting ...
The problem is that, after I run the above script, I can't run it again. Jupyter notebook returns the error ReactorNotRestartable.

ReactorNotRestartable error while connecting to the websocket
It seems to be some issue with your Anaconda environment. Websocket code is working fine in our Linux Anaconda environment on Jupyter Notebook....

Scrapy: Fail To Re-Run In Jupyter Notebook Script ... - ADocLib
I am getting the following error: You cannot restart the reactor, but you should be able to run it more CrawlerRunner() deferred runner.crawl(spider)...

Run Scrapy code from Jupyter Notebook without issues
ReactorNotRestartable error can be mitigated using this package. In this blog post, I am showing the steps that I took to run scrapy...

How to overcome Scrapy - Reactor not Restartable - Reddit
If I need to search another term, I need to restart Jupyter Notebook. Otherwise, I will receive Reactor not restartable error.
