parallelization results in signal(11): Segmentation fault
See original GitHub issue
I am suffering from a segmentation fault (signal 11) when using joblib.Parallel to run a small Julia function across multiple worker processes.
I import a function in Python, modeled after this attempted solution:
from julia.api import Julia
jl = Julia(compiled_modules=False)
from julia import Main
Main.include("fastsum.jl")
from julia.Main import greenfunction
where the function itself is:
using Tullio, LoopVectorization
function greenfunction(mu, wns, sigwns, energy, dosnorm)
    new_array = Array{ComplexF64}(undef, length(wns))
    @tullio threads=false new_array[i] = 1 / (mu + wns[i] * 1im - energy[j] - sigwns[i]) * dosnorm[j]
    # `nothing` (lowercase) is the value; `Nothing` is its type
    dosnorm = nothing
    energy = nothing
    sigwns = nothing
    wns = nothing
    return new_array
end
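As an aside, the @tullio line above is just a sum over the repeated index j, so the same quantity can be computed with NumPy broadcasting, which sidesteps embedding Julia in the worker processes entirely. A minimal sketch (the name greenfunction_np is mine, and it assumes real-valued input arrays):

```python
import numpy as np

def greenfunction_np(mu, wns, sigwns, energy, dosnorm):
    # denom has shape (len(wns), len(energy)) via broadcasting;
    # summing over axis 1 reproduces the implicit sum over j in @tullio.
    denom = mu + 1j * wns[:, None] - sigwns[:, None] - energy[None, :]
    return (dosnorm[None, :] / denom).sum(axis=1)
```

This trades Tullio's loop fusion for a temporary (n_wns, n_energy) array, so it uses more memory, but it keeps the whole computation inside NumPy where joblib workers are well tested.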
Clearing some of the variables inside the function raises the number of processes I can run from roughly 10 to 80, but it still segfaults eventually, with output such as:
signal (15): Terminated
in expression starting at none:0
mul_fast at ./fastmath.jl:167 [inlined]
mul_fast at ./fastmath.jl:219 [inlined]
...
... ( a lot of PyCall directories)
...
unknown function (ip: (nil))
Allocations: 76861098 (Pool: 76845028; Big: 16070); GC: 116
signal (11): Segmentation fault
in expression starting at none:0
...
joblib.externals.loky.process_executor.TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.
The more detailed errors are here.
Is there a way to clear the Julia cache, since that seems to be the problem? Or to initialize Julia less frequently somehow? I call the imported function several hundred times.
I also attempted
from julia import Main
greenfunction = Main.eval("""
using Tullio, LoopVectorization
function greenfunction(mu, wns, sigwns, energy, dosnorm)
    new_array = Array{ComplexF64}(undef, length(wns))
    @tullio threads=false new_array[i] = 1 / (mu + wns[i] * 1im - energy[j] - sigwns[i]) * dosnorm[j]
    dosnorm = nothing
    energy = nothing
    sigwns = nothing
    wns = nothing
    return new_array
end
""")
but that gives the same error.
Issue Analytics
- Created 2 years ago
- Comments: 6 (2 by maintainers)
I am having similar issues. I am trying to use pyjulia in a Flask web server, but it keeps segfaulting; I assume the web server is multithreaded.
What subprocess.Popen uses is not very clear from the documentation. Somebody has to dig into the internals of subprocess.
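Along those lines, one way to make the crashes non-fatal to the Flask parent is to run each Julia call in its own OS process and exchange JSON on stdout; a segfault then kills only the child, and the parent sees a CalledProcessError instead of dying. A hedged sketch (the fastsum_cli.jl wrapper script is hypothetical):

```python
import json
import subprocess

def run_in_subprocess(cmd, payload):
    # Pass `payload` as a JSON argv argument and parse JSON from stdout.
    # For the issue above, cmd might be ["julia", "fastsum_cli.jl"],
    # i.e. a hypothetical wrapper that prints its result as JSON.
    proc = subprocess.run(cmd + [json.dumps(payload)],
                          capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)
```

This costs a full Julia startup per call, so it only pays off when the computation is long relative to startup, but it gives hard isolation that in-process embedding cannot.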