ValueError when running VarNet

I get the following ValueError when I try to run VarNet. Any idea why? I am using the NYU multi-coil knee dataset, but only a limited subset of it (10 training h5py files). My environment has pytorch-lightning 0.6.0, torch 1.3.1, and torchvision 0.4.2.

This is what I am using to train:

python models/varnet/varnet.py --resolution 320 --mode train --challenge multicoil --exp var_net --mask-type random --data-path /media/iva19/multicoil_train/

and that’s the error:

INFO:root:gpu available: True, used: True
INFO:root:VISIBLE GPUS: 0
Traceback (most recent call last):
  File "models/varnet/varnet.py", line 374, in <module>
    main()
  File "models/varnet/varnet.py", line 371, in main
    run(args)
  File "models/varnet/varnet.py", line 342, in run
    trainer.fit(model)
  File "/home/iva19/usr/local/miniconda3/envs/fastMRI/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 687, in fit
    mp.spawn(self.ddp_train, nprocs=self.num_gpus, args=(model,))
  File "/home/iva19/usr/local/miniconda3/envs/fastMRI/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 171, in spawn
    while not spawn_context.join():
  File "/home/iva19/usr/local/miniconda3/envs/fastMRI/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 118, in join
    raise Exception(msg)
Exception: 

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/home/iva19/usr/local/miniconda3/envs/fastMRI/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
    fn(i, *args)
  File "/home/iva19/usr/local/miniconda3/envs/fastMRI/lib/python3.6/site-packages/pytorch_lightning/trainer/distrib_data_parallel.py", line 331, in ddp_train
    self.run_pretrain_routine(model)
  File "/home/iva19/usr/local/miniconda3/envs/fastMRI/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 757, in run_pretrain_routine
    self.logger.log_hyperparams(ref_model.hparams)
  File "/home/iva19/usr/local/miniconda3/envs/fastMRI/lib/python3.6/site-packages/pytorch_lightning/logging/base.py", line 14, in wrapped_fn
    fn(self, *args, **kwargs)
  File "/home/iva19/usr/local/miniconda3/envs/fastMRI/lib/python3.6/site-packages/pytorch_lightning/logging/tensorboard.py", line 88, in log_hyperparams
    self.experiment.add_hparams(hparam_dict=params, metric_dict={})
  File "/home/iva19/usr/local/miniconda3/envs/fastMRI/lib/python3.6/site-packages/torch/utils/tensorboard/writer.py", line 292, in add_hparams
    exp, ssi, sei = hparams(hparam_dict, metric_dict)
  File "/home/iva19/usr/local/miniconda3/envs/fastMRI/lib/python3.6/site-packages/torch/utils/tensorboard/summary.py", line 156, in hparams
    raise ValueError('value should be one of int, float, str, bool, or torch.Tensor')
ValueError: value should be one of int, float, str, bool, or torch.Tensor
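
The traceback shows the failure happening in log_hyperparams, where the TensorBoard writer's add_hparams rejects any hyperparameter value that is not an int, float, str, bool, or torch.Tensor. One plausible trigger, not confirmed in this thread, is that the parsed command-line arguments contain a value such as None, a list, or a Path. The sketch below is a minimal illustration of converting such values to strings before they are logged. The helper name sanitize_hparams and the example fields accelerations and checkpoint are hypothetical and do not come from the fastMRI code; for simplicity the sketch does not import torch, so it would also stringify tensors.

```python
import argparse
from pathlib import Path

# Plain Python types that the TensorBoard hparams summary accepts.
ALLOWED_TYPES = (int, float, str, bool)


def sanitize_hparams(hparams):
    """Return a dict in which every value is an int, float, str, or bool.

    Hypothetical helper: anything else (None, list, Path, ...) is
    replaced by its string representation.
    """
    if isinstance(hparams, argparse.Namespace):
        params = vars(hparams)
    else:
        params = dict(hparams)
    clean = {}
    for key, value in params.items():
        if isinstance(value, ALLOWED_TYPES):
            clean[key] = value
        else:
            clean[key] = str(value)
    return clean


# Example namespace; the field names below are assumptions for illustration.
args = argparse.Namespace(
    resolution=320,
    challenge="multicoil",
    data_path=Path("/media/data"),
    accelerations=[4, 8],
    checkpoint=None,
)
clean = sanitize_hparams(args)
for name, value in clean.items():
    print(name, type(value).__name__, value)
```

Whether this applies depends on which hyperparameter actually has the unsupported type; printing vars(args) before training starts is a quick way to check.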

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 9 (4 by maintainers)

Top GitHub Comments

mmuckley commented, Jun 25, 2020 (3 reactions)

I saw that as well. This model is pretty heavy on memory, even on my 16 GB GPU. Perhaps they prototyped it on a 32 GB GPU.

I was able to get past this error by decreasing the size of the model, e.g., with --num-cascades 4.

mmuckley commented, Jun 25, 2020 (2 reactions)

Great. I’m going to be going through the repository soon and trying to clean up a few things, including requirements.txt.
