Pytorch-lighning: W&B process failed to launch
See original GitHub issueDescription
I am trying to get weights and bias working with PyTorch-lightning. I didn’t know if this is an issue of lightning or Wandb. But I think it’s from wandb. When trying to start the run I get the error message:
wandb.run_manager.LaunchError: W&B process failed to launch, see: wandb\debug.log
Stacktrace:
D:\Programme\Anaconda3\envs\pytorch\python.exe C:/Users/schup/PycharmProjects/Workspace/pytorch_test/GAN/CNN_lightning.py
wandb: Tracking run with wandb version 0.8.29
Traceback (most recent call last):
File "D:\Programme\Anaconda3\envs\pytorch\lib\site-packages\wandb\internal_cli.py", line 106, in <module>
main()
File "D:\Programme\Anaconda3\envs\pytorch\lib\site-packages\wandb\internal_cli.py", line 98, in main
headless(args)
File "D:\Programme\Anaconda3\envs\pytorch\lib\site-packages\wandb\internal_cli.py", line 54, in headless
util.sentry_reraise(e)
File "D:\Programme\Anaconda3\envs\pytorch\lib\site-packages\wandb\util.py", line 94, in sentry_reraise
six.reraise(type(exc), exc, sys.exc_info()[2])
File "D:\Programme\Anaconda3\envs\pytorch\lib\site-packages\six.py", line 703, in reraise
raise value
File "D:\Programme\Anaconda3\envs\pytorch\lib\site-packages\wandb\internal_cli.py", line 52, in headless
user_process_pid, stdout_master_fd, stderr_master_fd)
File "D:\Programme\Anaconda3\envs\pytorch\lib\site-packages\wandb\run_manager.py", line 1137, in wrap_existing_process
stderr_read_file = os.fdopen(stderr_read_fd, 'rb')
File "D:\Programme\Anaconda3\envs\pytorch\lib\os.py", line 1027, in fdopen
return io.open(fd, *args, **kwargs)
OSError: [WinError 6] The handle is invalid
wandb: ERROR W&B process (PID 15916) did not respond
wandb: ERROR W&B process failed to launch, see: wandb\debug.log
Traceback (most recent call last):
File "C:/Users/schup/PycharmProjects/Workspace/pytorch_test/GAN/CNN_lightning.py", line 132, in <module>
trainer.fit(net)
File "D:\Programme\Anaconda3\envs\pytorch\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 630, in fit
self.run_pretrain_routine(model)
File "D:\Programme\Anaconda3\envs\pytorch\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 748, in run_pretrain_routine
self.logger.log_hyperparams(ref_model.hparams)
File "D:\Programme\Anaconda3\envs\pytorch\lib\site-packages\pytorch_lightning\loggers\base.py", line 18, in wrapped_fn
fn(self, *args, **kwargs)
File "D:\Programme\Anaconda3\envs\pytorch\lib\site-packages\pytorch_lightning\loggers\wandb.py", line 96, in log_hyperparams
self.experiment.config.update(params)
File "D:\Programme\Anaconda3\envs\pytorch\lib\site-packages\pytorch_lightning\loggers\wandb.py", line 87, in experiment
id=self._id, resume='allow', tags=self._tags, entity=self._entity)
File "D:\Programme\Anaconda3\envs\pytorch\lib\site-packages\wandb\__init__.py", line 1088, in init
_init_headless(run)
File "D:\Programme\Anaconda3\envs\pytorch\lib\site-packages\wandb\__init__.py", line 304, in _init_headless
"W&B process failed to launch, see: {}".format(path))
wandb.run_manager.LaunchError: W&B process failed to launch, see: wandb\debug.log
Debug log:
2020-03-07 18:47:34,666 DEBUG MainThread:15916 [wandb_config.py:_load_defaults():119] no defaults not found in config-defaults.yaml
2020-03-07 18:47:34,723 DEBUG MainThread:15916 [util.py:is_cygwin_git():314] Failed checking if running in CYGWIN due to: FileNotFoundError(2, 'The system cannot find the file specified', None, 2, None)
2020-03-07 18:47:34,725 DEBUG MainThread:15916 [git_repo.py:repo():30] git repository is invalid
2020-03-07 18:47:34,742 DEBUG MainThread:15916 [meta.py:setup():97] code probe starting
2020-03-07 18:47:34,742 DEBUG MainThread:15916 [meta.py:setup():101] non time limited probe of code
2020-03-07 18:47:34,743 DEBUG MainThread:15916 [meta.py:_setup_code_program():58] save program starting
2020-03-07 18:47:34,743 DEBUG MainThread:15916 [meta.py:_setup_code_program():60] save program starting: C:/Users/schup/PycharmProjects/Workspace/pytorch_test/GAN/CNN_lightning.py
2020-03-07 18:47:34,743 DEBUG MainThread:15916 [meta.py:_setup_code_program():65] save program saved: C:\Users\schup\PycharmProjects\Workspace\pytorch_test\GAN\wandb\run-20200307_174733-wcp2iba7\code\CNN_lightning.py
2020-03-07 18:47:34,744 DEBUG MainThread:15916 [meta.py:_setup_code_program():67] save program
2020-03-07 18:47:34,744 DEBUG MainThread:15916 [meta.py:setup():119] code probe done
2020-03-07 18:47:34,753 DEBUG MainThread:15916 [run_manager.py:__init__():541] Initialized sync for None/wcp2iba7
2020-03-07 18:47:34,758 DEBUG MainThread:15916 [connectionpool.py:_new_conn():959] Starting new HTTPS connection (1): api.wandb.ai:443
2020-03-07 18:47:34,951 DEBUG MainThread:15916 [connectionpool.py:_make_request():437] https://api.wandb.ai:443 "POST /graphql HTTP/1.1" 200 None
2020-03-07 18:47:34,956 DEBUG raven-sentry.BackgroundWorker:15916 [connectionpool.py:_new_conn():959] Starting new HTTPS connection (1): sentry.io:443
What I Did
if __name__ == "__main__":
net = CNN()
wd_logger = loggers.WandbLogger(name="test")
trainer = pl.Trainer(logger=wd_logger)
trainer.fit(net)
trainer.test()
My Enviroment
PyTorch version: 1.4.0 OS: Microsoft Windows 10 Pro Python version: 3.7 Is CUDA available: Yes Wandb 0.8.29
Issue Analytics
- State:
- Created 4 years ago
- Comments:11 (5 by maintainers)
Top Results From Across the Web
Pytorch-lighning: W&B process failed to launch #905 - GitHub
Description I am trying to get weights and bias working with PyTorch-lightning. I didn't know if this is an issue of lightning or...
Read more >Trainer — PyTorch Lightning 1.8.5.post0 documentation
Once you're done training, feel free to run the test set! ... may fail if other process is occupying) trainer = Trainer(accelerator="gpu", devices=2, ......
Read more >Pytorch-lightning strange error: implemented method that ...
I am trying to use the library Pytorch Lightning, but I am encountering a strange error that I am not able to solve....
Read more >PyTorch Lightning
Run your code on any hardware; Performance & bottleneck profiler; Model checkpointing; 16-bit precision; Run distributed training.
Read more >torchrun (Elastic Launch) — PyTorch 1.13 documentation
distributed.launch to torchrun . To take advantage of new features such as elasticity, fault-tolerance, and error reporting of ...
Read more >
Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free
Top Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Hey @LuposX In the past year we’ve majorly reworked the CLI and UI for Weights & Biases. We’re closing issues older than 6 months. Please comment to reopen.
@borisdayma this should be a minimal example of an MNIST classifier: