partial pipeline with parallel runner failing
I have two pipelines running one after the other. Everything runs happily through with ParallelRunner specified, but when I try to run only the second pipeline, it crashes with ParallelRunner and not when run sequentially. The only inputs to the second pipeline are a dataframe stored on disk as Parquet and the parameters.yaml; it outputs some `.pkl` files.
```
kedro run --runner=ParallelRunner               # works
kedro run --runner=ParallelRunner --pipeline ml # doesn't work; ml runs after etl
kedro run --pipeline ml                         # works
```
I am really at a loss as to what could be causing this behaviour; any insights would be appreciated.
The only traceback I get is this:
```
Traceback (most recent call last):
  File "/usr/local/var/pyenv/versions/test-ml/bin/kedro", line 8, in <module>
    sys.exit(main())
  File "/usr/local/var/pyenv/versions/3.8.9/envs/test-ml/lib/python3.8/site-packages/kedro/framework/cli/cli.py", line 266, in main
    cli_collection()
  File "/usr/local/var/pyenv/versions/3.8.9/envs/test-ml/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/var/pyenv/versions/3.8.9/envs/test-ml/lib/python3.8/site-packages/kedro/framework/cli/cli.py", line 211, in main
    super().main(
  File "/usr/local/var/pyenv/versions/3.8.9/envs/test-ml/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/var/pyenv/versions/3.8.9/envs/test-ml/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/var/pyenv/versions/3.8.9/envs/test-ml/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/var/pyenv/versions/3.8.9/envs/test-ml/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/var/pyenv/versions/3.8.9/envs/test-ml/lib/python3.8/site-packages/kedro/framework/cli/project.py", line 408, in run
    session.run(
  File "/usr/local/var/pyenv/versions/3.8.9/envs/test-ml/lib/python3.8/site-packages/kedro/framework/session/session.py", line 414, in run
    run_result = runner.run(filtered_pipeline, catalog, run_id)
  File "/usr/local/var/pyenv/versions/3.8.9/envs/test-ml/lib/python3.8/site-packages/kedro/runner/runner.py", line 106, in run
    self._run(pipeline, catalog, run_id)
  File "/usr/local/var/pyenv/versions/3.8.9/envs/test-ml/lib/python3.8/site-packages/kedro/runner/parallel_runner.py", line 354, in _run
    node = future.result()
  File "/usr/local/var/pyenv/versions/3.8.9/lib/python3.8/concurrent/futures/_base.py", line 437, in result
    return self.__get_result()
  File "/usr/local/var/pyenv/versions/3.8.9/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
```
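For what it's worth, `BrokenProcessPool` is raised by Python's `concurrent.futures` whenever a worker process dies abruptly (segfault, OOM kill, a native library crashing after `fork`), so the traceback above points at the symptom, not the cause. A minimal, Kedro-free sketch of how the parent process sees this error when a child is killed:

```python
import os
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool


def _worker_that_dies():
    # Simulate a child terminated abruptly (e.g. killed by the OS or a
    # crashing native extension): os._exit skips all Python cleanup.
    os._exit(1)


def run_pool():
    """Submit a doomed task; return the exception name the parent sees."""
    with ProcessPoolExecutor(max_workers=2) as pool:
        future = pool.submit(_worker_that_dies)
        try:
            future.result()  # same call as in parallel_runner.py above
        except BrokenProcessPool as exc:
            return type(exc).__name__
    return None


if __name__ == "__main__":
    print(run_pool())
```

This is why the message is so unspecific: the pool only knows a worker vanished, not why.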
kedro 0.17.5 python 3.8.9
Issue Analytics
- State:
- Created 2 years ago
- Comments: 9 (6 by maintainers)
Thanks, it seems setting `n_jobs=1` fixed this; I was doing that for other models but not for XGBoost for some reason. I will close the issue, but I'm still curious why this only appeared as an issue when running the second pipeline. My guess would be something like: XGBoost couldn't fork anymore after running the ETL pipeline for whatever reason, so it defaulted to a single job.
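The fix above amounts to keeping the model library single-threaded so it doesn't spawn its own workers inside a process that ParallelRunner already forked. A minimal sketch of threading `n_jobs` through a node; `train_model` and the parameter keys are illustrative stand-ins, not taken from the issue's project:

```python
# Hedged sketch: in the real project this node would build an XGBoost
# model; here a plain dict stands in so the plumbing is visible.
def train_model(df, params):
    """Kedro-style node: read n_jobs from parameters, default to 1.

    Under ParallelRunner each node already runs in its own process,
    so the model library itself should stay single-threaded.
    """
    model_kwargs = {"n_jobs": params.get("n_jobs", 1)}
    # In the real node this would be something like:
    #   model = xgboost.XGBRegressor(**model_kwargs).fit(X, y)
    return model_kwargs
```

With `n_jobs: 1` in `parameters.yaml`, XGBoost's own worker pool no longer collides with the process pool spawned by ParallelRunner.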
Okay, good. It should be noted that in some cases it can be more efficient to use XGBoost's parallelisation instead of Kedro's, and if the pipelines are independent you can always parallelise at the CLI level:

```
kedro run --pipeline a --parallel & kedro run --pipeline b --runner=SequentialRunner --params="n_jobs:32"
```

`&` will run both at the same time (use `&&` if you want to run `a` then `b`), and `--params` lets you set `n_jobs` at runtime. `--runner=SequentialRunner` is only here to be explicit; it is assumed if not provided.
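The `&`/`&&` distinction above is plain shell behaviour, independent of Kedro; a quick illustration:

```shell
# '&&' runs the second command only if the first succeeds:
true && echo "runs"        # prints "runs"
false && echo "skipped"    # prints nothing: 'false' short-circuits the chain

# '&' backgrounds the first command so both run concurrently;
# 'wait' blocks until all background jobs finish:
sleep 1 & echo "prints while sleep is still running"
wait
```

So the one-liner in the comment launches pipeline `a` in the background and pipeline `b` in the foreground at the same time.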