
"Cannot re-initialize CUDA in forked subprocess." error when running "transformers/notebooks/05-benchmark.ipynb" notebook

See original GitHub issue

Environment info

I am getting this error on a server, but also on Colab, so here are the Colab specs:

  • transformers version: 3.3.1
  • Platform: Linux-4.19.112+-x86_64-with-Ubuntu-18.04-bionic
  • Python version: 3.6.9
  • PyTorch version (GPU?): 1.6.0+cu101 (True)
  • Tensorflow version (GPU?): 2.3.0 (True)
  • Using GPU in script?: Yes
  • Using distributed or parallel set-up in script?: No?

And on my server:

  • transformers version: 3.3.1
  • Platform: Linux-4.19.0-11-amd64-x86_64-with-debian-10.6
  • Python version: 3.6.12
  • PyTorch version (GPU?): 1.4.0 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Using GPU in script?: Yes
  • Using distributed or parallel set-up in script?: No?

Who can help

@patrickvonplaten

Information

Model I am using (Bert, XLNet …): Any model.

The problem arises when using:

  • the official example scripts: (give details below)
  • my own modified scripts: (give details below)

The task I am working on is:

  • an official GLUE/SQuAD task: (give the name)
  • my own task or dataset: (give details below)

To reproduce

Steps to reproduce the behavior:

  1. Run the colab on benchmarking provided on the transformers GitHub.
2020-10-15 14:20:38.078717: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
1 / 5
Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
Traceback (most recent call last):
  File "run_benchmark.py", line 47, in <module>
    main()
  File "run_benchmark.py", line 43, in main
    benchmark.run()
  File "/usr/local/lib/python3.6/dist-packages/transformers/benchmark/benchmark_utils.py", line 674, in run
    memory, inference_summary = self.inference_memory(model_name, batch_size, sequence_length)
ValueError: too many values to unpack (expected 2)
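The first two log lines point at the root cause: the benchmark runner launches its measurement in a child process created with the default fork start method, and an already-initialized CUDA context cannot survive a fork. The later ValueError is likely a secondary symptom of the child failing before returning its results. Below is a minimal sketch of the fix the error message itself suggests, using the standard-library multiprocessing module; with PyTorch the pattern is the same via torch.multiprocessing, which mirrors this API. The worker function and names here are illustrative, not part of the benchmark code.

```python
# Sketch: use the 'spawn' start method so the child starts with a fresh
# interpreter (and thus a fresh CUDA context) instead of a forked copy
# of the parent's already-initialized one.
import multiprocessing as mp


def worker(q):
    # In a real script, CUDA work would go here; a spawned child
    # initializes CUDA itself rather than inheriting a forked context.
    q.put("ok")


def run_in_spawned_child():
    # A per-call context avoids mp.set_start_method(), which may only
    # be called once per program.
    ctx = mp.get_context("spawn")
    q = ctx.Queue()
    p = ctx.Process(target=worker, args=(q,))
    p.start()
    result = q.get()
    p.join()
    return result


if __name__ == "__main__":
    print(run_in_spawned_child())  # prints "ok"
```

Note that with spawn, everything the child needs must be importable and picklable, which is why the benchmark utilities also expose a non-multiprocessing path (the multi_process=False workaround shown in the comments below).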

Expected behavior

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

2 reactions
dhkim0225 commented, Nov 18, 2020

Try this. You can check the other parameters for PyTorchBenchmarkArguments in the transformers documentation.

# main.py
def main():
    from transformers import PyTorchBenchmark, PyTorchBenchmarkArguments

    # multi_process=False keeps the benchmark in the current process,
    # avoiding the forked-subprocess CUDA re-initialization error.
    args = PyTorchBenchmarkArguments(models=["bert-base-uncased"],
                                     batch_sizes=[8],
                                     sequence_lengths=[8, 32, 128, 512],
                                     multi_process=False)
    print(args.do_multi_processing)
    benchmark = PyTorchBenchmark(args)
    results = benchmark.run()
    print(results)


if __name__ == '__main__':
    main()

Run it with:

CUDA_VISIBLE_DEVICES=0 python main.py
0 reactions
github-actions[bot] commented, Mar 6, 2021

This issue has been automatically marked as stale and closed because it has not had recent activity. Thank you for your contributions.

If you think this still needs to be addressed please comment on this thread.

