"Cannot re-initialize CUDA in forked subprocess." error when running "transformers/notebooks/05-benchmark.ipynb" notebook
Environment info
I am getting this error on a server, but also on Google Colab, so here are the Colab specs:
- `transformers` version: 3.3.1
- Platform: Linux-4.19.112+-x86_64-with-Ubuntu-18.04-bionic
- Python version: 3.6.9
- PyTorch version (GPU?): 1.6.0+cu101 (True)
- Tensorflow version (GPU?): 2.3.0 (True)
- Using GPU in script?: Yes
- Using distributed or parallel set-up in script?: No?
And on my server:
- `transformers` version: 3.3.1
- Platform: Linux-4.19.0-11-amd64-x86_64-with-debian-10.6
- Python version: 3.6.12
- PyTorch version (GPU?): 1.4.0 (True)
- Tensorflow version (GPU?): not installed (NA)
- Using GPU in script?: Yes
- Using distributed or parallel set-up in script?: No?
Who can help
Information
Model I am using (Bert, XLNet …): Any model.
The problem arises when using:
- the official example scripts: (give details below)
- my own modified scripts: (give details below)
The task I am working on is:
- an official GLUE/SQuAD task: (give the name)
- my own task or dataset: (give details below)
To reproduce
Steps to reproduce the behavior:
2020-10-15 14:20:38.078717: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
1 / 5
Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
Traceback (most recent call last):
  File "run_benchmark.py", line 47, in <module>
    main()
  File "run_benchmark.py", line 43, in main
    benchmark.run()
  File "/usr/local/lib/python3.6/dist-packages/transformers/benchmark/benchmark_utils.py", line 674, in run
    memory, inference_summary = self.inference_memory(model_name, batch_size, sequence_length)
ValueError: too many values to unpack (expected 2)
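The first two repeated lines point at the root cause: the benchmark forks worker processes after CUDA has already been initialized in the parent, and a forked child cannot re-initialize CUDA. The final `ValueError` is likely only a downstream symptom of that failed measurement. As the message says, Python's "spawn" start method avoids the problem by giving each child a fresh interpreter in which CUDA can be initialized from scratch. A minimal sketch of that pattern using only the standard library (no CUDA needed to see the mechanics; `square` and `run_in_spawned_process` are illustrative names, not part of the benchmark code):

```python
import multiprocessing as mp

def square(x, queue):
    # In a real benchmark this worker would touch CUDA; with the "spawn"
    # start method the child is a fresh interpreter, so initializing
    # CUDA here would be safe even if the parent already uses the GPU.
    queue.put(x * x)

def run_in_spawned_process(x):
    ctx = mp.get_context("spawn")  # instead of the default "fork" on Linux
    queue = ctx.Queue()
    p = ctx.Process(target=square, args=(x, queue))
    p.start()
    p.join()
    return queue.get()

if __name__ == "__main__":
    print(run_in_spawned_process(4))  # prints 16
```

Note the `if __name__ == "__main__"` guard: with "spawn", the child re-imports the main module, so unguarded top-level code would run again in every child.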
Expected behavior
Issue Analytics
- State:
- Created 3 years ago
- Comments: 5 (2 by maintainers)
Try this. You can check the other parameters of `PyTorchBenchmarkArguments` in its documentation.
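The code snippet in this comment did not survive the page scrape. The commonly suggested workaround for this error in notebooks is to disable the benchmark's multiprocessing, so no CUDA-initialized process is ever forked. A hedged sketch, assuming `multi_process` is an accepted argument of `PyTorchBenchmarkArguments` in transformers 3.3.x (verify against your installed version):

```python
# Sketch of the usual workaround, not necessarily the commenter's exact code:
# run the benchmark in the main process instead of forked workers.
from transformers import PyTorchBenchmark, PyTorchBenchmarkArguments

args = PyTorchBenchmarkArguments(
    models=["bert-base-uncased"],
    batch_sizes=[8],
    sequence_lengths=[128],
    multi_process=False,  # avoids "Cannot re-initialize CUDA in forked subprocess"
)
benchmark = PyTorchBenchmark(args)
results = benchmark.run()
```

The multi-process mode exists to isolate each measurement in its own process, so results taken in the main process may differ slightly from the default.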
This issue has been automatically marked as stale and closed because it has not had recent activity. Thank you for your contributions.
If you think this still needs to be addressed please comment on this thread.