Processes get stuck when passing large objects to the function to be parallelized
Problem:
Apply an NLP deep learning model for text generation over the rows of a pandas Series. The function call is:
out = text_column.parallel_apply(lambda x: generate_text(args, model, tokenizer, x))
where `args` and `tokenizer` are light objects, but `model` is heavy: it holds a PyTorch model that weighs more than 6 GB on disk and takes up ~12 GB of RAM when loaded.
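For context, this is roughly how the call is set up (a minimal sketch; the `pandarallel.initialize` arguments are an assumption inferred from the log below, and `generate_text`, `args`, `model`, `tokenizer`, and `text_column` come from the surrounding script):

```python
import pandas as pd
from pandarallel import pandarallel

# nb_workers=8 matches the "8 workers" line in the log below, and
# progress_bar=True matches the percentage output; both are assumptions
# about the original script.
pandarallel.initialize(nb_workers=8, progress_bar=True)

# text_column is a pandas Series of input strings; generate_text, args,
# model and tokenizer are defined earlier (model is the ~12 GB object).
out = text_column.parallel_apply(
    lambda x: generate_text(args, model, tokenizer, x)
)
```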
I have been doing some tests, and the problem arises only when I pass the heavy model to the function (even without actually running it inside the function), so it seems the problem is passing an argument that takes up a lot of memory. (Maybe related to the shared-memory strategy used for parallel computing.)
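A quick way to test that hypothesis is to measure how large and how slow the pickled model actually is, since that is what has to travel through the pipe to every worker (a sketch; `model` is the already-loaded PyTorch model):

```python
import pickle
import time

start = time.time()
payload = pickle.dumps(model, protocol=pickle.HIGHEST_PROTOCOL)
elapsed = time.time() - start

# If this prints several GB and many seconds, shipping the model to each of
# the 8 workers over a pipe would explain the apparent hang (or a pickle
# failure on the older Python versions mentioned at the end of the thread).
print(f"pickled model: {len(payload) / 1e9:.2f} GB in {elapsed:.1f} s")
```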
After running `parallel_apply`, the output I get is:
INFO: Pandarallel will run on 8 workers.
INFO: Pandarallel will use standard multiprocessing data transfer (pipe) to transfer data between the main process and workers.
0.00% | 0 / 552 |
0.00% | 0 / 552 |
0.00% | 0 / 551 |
0.00% | 0 / 551 |
0.00% | 0 / 551 |
0.00% | 0 / 551 |
0.00% | 0 / 551 |
0.00% | 0 / 551 |
And it gets stuck there forever. Indeed, there are two processes spawned and both are stopped:
ablanco+ 85448 0.0 4.9 17900532 12936684 pts/27 Sl 14:41 0:00 python3 text_generation.py --input_file input.csv --model_type gpt2 --output_file out.csv --no_cuda --n_cpu 8
ablanco+ 85229 21.4 21.6 61774336 57023740 pts/27 Sl 14:39 2:26 python3 text_generation.py --input_file input.csv --model_type gpt2 --output_file out.csv --no_cuda --n_cpu 8
Currently fixed by upgrading Python from 3.7.4 to 3.7.6; apparently the problem was with pickle.
For those wondering why a single process runs indefinitely with no results: I was on 3.6.4, and upgrading to 3.7.6 fixed the issue. Still no luck with the progress bars, sadly.
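For anyone who cannot upgrade Python, one common workaround (a sketch, not from the original thread) is to avoid shipping the model through the pipe at all and instead load it lazily, once per worker process; `load_model` here is a hypothetical loader standing in for whatever code builds the GPT-2 model in the original script:

```python
_MODEL = None  # per-process cache, populated lazily inside each worker

def generate_row(x):
    global _MODEL
    if _MODEL is None:
        # Hypothetical loader; replace with the code that builds the ~12 GB
        # model before parallel_apply is called in the original script.
        _MODEL = load_model(args)
    return generate_text(args, _MODEL, tokenizer, x)

out = text_column.parallel_apply(generate_row)
```

Note that each worker then holds its own copy of the model, so `nb_workers` may need to be lowered to keep total RAM usage within bounds.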