"AttributeError: 'DataRouter' object has no attribute 'pool'" while running the HTTP server on a machine without multiprocessing support
Rasa NLU version: 0.13.8
Operating system (windows, osx, …): Ubuntu 14.04
Content of model configuration file:
language: "en"
pipeline:
- name: "tokenizer_whitespace"
- name: "ner_crf"
- name: "ner_synonyms"
- name: "ner_duckling"
  dimensions: ["time"]
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"
  intent_tokenization_flag: true
  intent_split_symbol: "+"
Issue:
When I try to start the `rasa_nlu` HTTP server, I get the following error:
(bot_env) [ps597689]$ python -m rasa_nlu.server --path nlu_models/
Traceback (most recent call last):
File "/home/<username>/opt/python-3.6.2/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/<username>/opt/python-3.6.2/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/<username>/bot.<domain>/bot_env/lib/python3.6/site-packages/rasa_nlu/server.py", line 438, in <module>
wait_time_between_pulls=cmdline_args.wait_time_between_pulls
File "/home/<username>/bot.<domain>/bot_env/lib/python3.6/site-packages/rasa_nlu/data_router.py", line 119, in __init__
self.pool = ProcessPool(self._training_processes)
File "/home/<username>/opt/python-3.6.2/lib/python3.6/concurrent/futures/process.py", line 390, in __init__
EXTRA_QUEUED_CALLS)
File "/home/<username>/opt/python-3.6.2/lib/python3.6/multiprocessing/context.py", line 102, in Queue
return Queue(maxsize, ctx=self.get_context())
File "/home/<username>/opt/python-3.6.2/lib/python3.6/multiprocessing/queues.py", line 42, in __init__
self._rlock = ctx.Lock()
File "/home/<username>/opt/python-3.6.2/lib/python3.6/multiprocessing/context.py", line 67, in Lock
return Lock(ctx=self.get_context())
File "/home/<username>/opt/python-3.6.2/lib/python3.6/multiprocessing/synchronize.py", line 163, in __init__
SemLock.__init__(self, SEMAPHORE, 1, 1, ctx=ctx)
File "/home/<username>/opt/python-3.6.2/lib/python3.6/multiprocessing/synchronize.py", line 60, in __init__
unlink_now)
OSError: [Errno 38] Function not implemented
Exception ignored in: <bound method DataRouter.__del__ of <rasa_nlu.data_router.DataRouter object at 0x7f2a37903c18>>
Traceback (most recent call last):
File "/home/<username>/bot.<domain>/bot_env/lib/python3.6/site-packages/rasa_nlu/data_router.py", line 123, in __del__
self.pool.shutdown()
AttributeError: 'DataRouter' object has no attribute 'pool'
This error only occurs when I try to deploy the code on a DreamHost VPS. The `OSError: [Errno 38] Function not implemented` is thrown because the DreamHost VPS apparently doesn’t allow multiprocessing (I tried running a simple multiprocessing script and it throws the same error).
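A quick check along these lines can confirm whether a host supports the primitives that process pools need. This is only a sketch: `multiprocessing_supported` is a hypothetical helper name, and it assumes that creating a `multiprocessing.Lock` fails in the same way as the traceback above on an affected machine.

```python
import multiprocessing


def multiprocessing_supported():
    """Return True if a multiprocessing lock can be created on this host.

    Hypothetical helper for illustration; not part of rasa_nlu.
    """
    try:
        # Creating a Lock allocates a semaphore, which is the call that
        # fails with "[Errno 38] Function not implemented" in the traceback.
        multiprocessing.Lock()
    except (OSError, ImportError):
        return False
    return True


if __name__ == "__main__":
    print("multiprocessing supported:", multiprocessing_supported())
```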
However, IMO `rasa_nlu` should ideally catch this and either:
- Revert to doing the same job without using multiprocessing OR
- Give the user a more informative error message
Although I am not very experienced with multiprocessing, thread synchronization, etc., I can try to write a PR to fix it if I am given an outline of what to do.
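The two options listed above could look roughly like the sketch below. Everything in it is an assumption for illustration: `make_executor` and `Router` are hypothetical names, not rasa_nlu code, and whether a thread pool is an acceptable substitute for a process pool during training is untested. It also shows why the secondary `AttributeError` appears: `__init__` failed before `self.pool` was assigned, so `__del__` had nothing to shut down. Assigning the attribute first avoids that.

```python
import logging
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

logger = logging.getLogger(__name__)


def make_executor(max_workers=1, pool_factory=ProcessPoolExecutor):
    """Return a process pool, or a thread pool if processes are unavailable.

    Hypothetical helper; `pool_factory` exists only to make the fallback
    easy to exercise on machines where multiprocessing works.
    """
    try:
        return pool_factory(max_workers)
    except (OSError, ImportError, NotImplementedError) as e:
        # For example: OSError: [Errno 38] Function not implemented
        logger.warning("Multiprocessing is not available on this system "
                       "(%s). Falling back to a thread pool.", e)
        return ThreadPoolExecutor(max_workers)


class Router(object):
    """Hypothetical stand-in for a class that owns a worker pool."""

    def __init__(self, max_workers=1, pool_factory=ProcessPoolExecutor):
        # Define the attribute before anything can raise, so __del__ is
        # safe even if pool creation fails part-way through __init__.
        self.pool = None
        self.pool = make_executor(max_workers, pool_factory)

    def __del__(self):
        if self.pool is not None:
            self.pool.shutdown()
```

With this shape, a host without multiprocessing gets a warning that names the cause instead of a traceback ending in an unrelated `AttributeError`.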
Hi @parthsharma1996, it seems that sklearn uses multiprocessing and there doesn’t seem to be a way to switch that off. Feel free to have another look into sklearn, but if it doesn’t work, I’m afraid we can’t switch off multiprocessing in Rasa NLU and sklearn-crf. Sorry!
Sure, so add a keyword argument to `DataRouter` called `single_process`, which is `None` by default. If it is set to `True`, do not initialise `self.pool`. In `start_train_process()`, add a check for this flag to `if six.PY2 and self._tf_in_pipeline(train_config)`, and update the `logger.warning` call that follows it. Make the `single_process` kwarg in `DataRouter` accessible in `rasa_nlu.server` by adding `--single-process` to the argument parser. Does that make sense?
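The outline above might translate into something like the following sketch. It is not the actual rasa_nlu code: `DataRouterSketch` and `create_argument_parser` are hypothetical stand-ins, the real `start_train_process()` takes different arguments and returns asynchronously, and the `six.PY2` condition is left out because its final form is not given in the thread.

```python
import argparse
from concurrent.futures import ProcessPoolExecutor


class DataRouterSketch(object):
    """Hypothetical stand-in for DataRouter; only pool handling is shown."""

    def __init__(self, max_training_processes=1, single_process=None):
        self.single_process = bool(single_process)
        # Always define the attribute so __del__ never hits AttributeError.
        self.pool = None
        if not self.single_process:
            self.pool = ProcessPoolExecutor(max_training_processes)

    def start_train_process(self, train_fn, *args):
        if self.pool is None:
            # Single-process mode: run training in the calling process.
            return train_fn(*args)
        # Simplification: block until the worker process finishes.
        return self.pool.submit(train_fn, *args).result()

    def __del__(self):
        if self.pool is not None:
            self.pool.shutdown()


def create_argument_parser():
    """Sketch of the server CLI with only the new flag."""
    parser = argparse.ArgumentParser(description="sketch of the server CLI")
    parser.add_argument("--single-process", action="store_true",
                        help="do not create a process pool for training")
    return parser


if __name__ == "__main__":
    cmdline_args = create_argument_parser().parse_args()
    router = DataRouterSketch(single_process=cmdline_args.single_process)
```

argparse maps `--single-process` to the attribute `single_process`, so the flag can be passed straight through to the keyword argument.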