IterableDataset Issue, OverflowError: cannot convert float infinity to integer
Hey all,
I am very new to ML, PyTorch, and PyTorch Lightning, so if this is a simple problem, sorry to bother you.
I am struggling to switch from PyTorch to PyTorch Lightning. The PyTorch code runs with no errors on Google Colab, so I think the structure is fine.
Now I am trying to implement Lightning following: https://towardsdatascience.com/from-pytorch-to-pytorch-lightning-a-gentle-introduction-b371b7caaf09
These are the links to my code:
- PyTorch: https://github.com/ykukkim/MLevent/blob/master/final_model.py
- PyTorch Lightning: https://github.com/ykukkim/MLevent/blob/master/lightningtest.py
However, I get the following error, which seems to be related to IterableDataset. My dataset is imbalanced: it does not have a constant length, and there are far more 0's than 1's (approximately 100:1), so I need to penalise the majority class by weighting the loss with an arbitrary factor.
I gather this issue has been raised a few times, and I am not sure whether there is a general fix or whether I have to work around it in my dataset.
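As an aside, a common way to weight the rare class in a binary problem in PyTorch is the `pos_weight` argument of `BCEWithLogitsLoss`. This is only a sketch assuming binary labels and the roughly 100:1 ratio described above, not code from the linked repo:

```python
import torch
import torch.nn as nn

# pos_weight multiplies the loss on positive (label 1) examples.
# With roughly 100 negatives per positive, weighting positives by 100
# balances their contribution to the loss.
criterion = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([100.0]))

logits = torch.zeros(8, 1)                     # raw model outputs
targets = torch.randint(0, 2, (8, 1)).float()  # binary labels
loss = criterion(logits, targets)
```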
```
Traceback (most recent call last):
  File "/Applications/PyCharm CE.app/Contents/plugins/python-ce/helpers/pydev/pydevd.py", line 1434, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/Applications/PyCharm CE.app/Contents/plugins/python-ce/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/Users/YKK/Documents/GitHub/mlevent/lightningtest.py", line 201, in <module>
    main(hparams)
  File "/Users/YKK/Documents/GitHub/mlevent/lightningtest.py", line 182, in main
    trainer.fit(model)
  File "/Users/YKK/anaconda3/envs/LMBTrain/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 707, in fit
    self.run_pretrain_routine(model)
  File "/Users/YKK/anaconda3/envs/LMBTrain/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 771, in run_pretrain_routine
    self.get_dataloaders(ref_model)
  File "/Users/YKK/anaconda3/envs/LMBTrain/lib/python3.7/site-packages/pytorch_lightning/trainer/data_loading.py", line 200, in get_dataloaders
    self.init_train_dataloader(model)
  File "/Users/YKK/anaconda3/envs/LMBTrain/lib/python3.7/site-packages/pytorch_lightning/trainer/data_loading.py", line 79, in init_train_dataloader
    self.val_check_batch = int(self.num_training_batches * self.val_check_interval)
OverflowError: cannot convert float infinity to integer
```
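For context on why this overflows: an IterableDataset has no `__len__`, so the Trainer treats the number of training batches as infinite, and converting `inf * val_check_interval` to an int fails. A minimal reproduction in plain Python:

```python
# With an IterableDataset the epoch length is unknown, so
# num_training_batches ends up as float("inf"). Multiplying by a
# fractional val_check_interval keeps it infinite, and int() then fails:
num_training_batches = float("inf")
val_check_interval = 1.0  # a fraction of the (unknown) epoch

try:
    val_check_batch = int(num_training_batches * val_check_interval)
except OverflowError as err:
    print(err)  # cannot convert float infinity to integer
```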
Issue Analytics
- Created: 4 years ago
- Reactions: 1
- Comments: 6 (3 by maintainers)
Thanks, this works!
`trainer = pl.Trainer(val_check_interval=100, gpus=None)`
I set `val_check_interval`, but I am not too sure what it does. Would you care to explain it to me?
Thank you!
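For reference, a sketch of the semantics: when `val_check_interval` is a float in (0, 1], it is a fraction of the training epoch, which requires a known epoch length; when it is an int, validation runs every that many training batches, which works even when the dataset length is unknown:

```python
import pytorch_lightning as pl

# An int val_check_interval avoids the int(inf * fraction) computation:
# validation runs every 100 training batches, regardless of epoch length.
trainer = pl.Trainer(val_check_interval=100, gpus=None)
```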
Leaving notes for others who find this:
I ran into the same error. I found this issue, and I am indeed using an IterableDataset.
I specified both `limit_train_batches` and `limit_val_batches` in the Trainer config, and now I get traditional train/validation epochs.
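A sketch of what that Trainer config might look like (the batch counts here are made-up values, not from the original post):

```python
import pytorch_lightning as pl

# With an IterableDataset the epoch length is unknown, so cap it
# explicitly: each "epoch" is then 1000 training batches, and each
# validation pass runs 100 batches.
trainer = pl.Trainer(
    limit_train_batches=1000,  # hypothetical value
    limit_val_batches=100,     # hypothetical value
)
```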