v1.6 is slower than v1.5
See original GitHub issue
🐛 Bug
1.6.0 is about 5 times slower than 1.5.0.
I ran the same code with the same parameters on version 1.5.0 and version 1.6.0; training under 1.6.0 is about 5 times slower.
To Reproduce
1.6.0
> pip install pytorch-lightning==1.6.0
...
> python run.py
{'optimizers': {'lr': 0.001, 'weight_decay': 0}, 'train': {'milestones': [20, 40, 60, 80, 100, 120, 160, 180, 200, 220], 'gamma': 0.5, 'batch_size': 16}, 'general': {'save_dir': 'logs'}, 'trainer': {'gpus': [0], 'accelerator': 'gpu', 'max_epochs': 1, 'val_check_interval': 1, 'limit_train_batches': 0.01, 'limit_val_batches': 0.01, 'profiler': 'simple'}, 'data': {'cls_type': 'cat'}, 'exp_name': 'test'}
Global seed set to 1234
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(val_check_interval=1)` was configured so validation will run after every batch.
Missing logger folder: logs/lightning_logs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
| Name | Type | Params
-------------------------------------------------
0 | model | PvnetModelResnet18 | 13.0 M
1 | vote_crit | SmoothL1Loss | 0
2 | seg_crit | CrossEntropyLoss | 0
-------------------------------------------------
13.0 M Trainable params
0 Non-trainable params
13.0 M Total params
51.831 Total estimated model params size (MB)
loading annotations into memory...
Done (t=1.30s)
creating index...
index created!
Epoch 0: 100%|██████████████████████████████████████████████████████| 12/12 [00:25<00:00, 2.11s/it, loss=0.861, v_num=0, train_ver_loss=0.205, train_seg_loss=0.656]
FIT Profiler Report
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Action | Mean duration (s) | Num calls | Total time (s) | Percentage % |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Total | - | 574 | 29.691 | 100 % |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| run_training_epoch | 25.136 | 1 | 25.136 | 84.66 |
| run_training_batch | 1.7876 | 12 | 21.451 | 72.247 |
| [LightningModule]LitPvnet.optimizer_step | 1.787 | 12 | 21.444 | 72.225 |
| [Strategy]SingleDeviceStrategy.backward | 1.1394 | 12 | 13.673 | 46.05 |
| [Strategy]SingleDeviceStrategy.training_step | 0.6367 | 12 | 7.6404 | 25.733 |
| [Callback]ModelCheckpoint{'monitor': None, 'mode': 'min', 'every_n_train_steps': 0, 'every_n_epochs': 1, 'train_time_interval': None, 'save_on_train_epoch_end': True}.on_train_epoch_end | 0.23877 | 1 | 0.23877 | 0.80419 |
| on_train_batch_end | 0.0010752 | 12 | 0.012903 | 0.043458 |
1.5.0
> pip install pytorch-lightning==1.5.0
...
> python run.py
{'optimizers': {'lr': 0.001, 'weight_decay': 0}, 'train': {'milestones': [20, 40, 60, 80, 100, 120, 160, 180, 200, 220], 'gamma': 0.5, 'batch_size': 16}, 'general': {'save_dir': 'logs'}, 'trainer': {'gpus': [0], 'accelerator': 'gpu', 'max_epochs': 1, 'val_check_interval': 1, 'limit_train_batches': 0.01, 'limit_val_batches': 0.01, 'profiler': 'simple'}, 'data': {'cls_type': 'cat'}, 'exp_name': 'test'}
Global seed set to 1234
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
| Name | Type | Params
-------------------------------------------------
0 | model | PvnetModelResnet18 | 13.0 M
1 | vote_crit | SmoothL1Loss | 0
2 | seg_crit | CrossEntropyLoss | 0
-------------------------------------------------
13.0 M Trainable params
0 Non-trainable params
13.0 M Total params
51.831 Total estimated model params size (MB)
loading annotations into memory...
Done (t=1.37s)
creating index...
index created!
Epoch 0: 100%|██████████████████████████████████████████████████████| 12/12 [00:09<00:00, 1.23it/s, loss=0.861, v_num=0, train_ver_loss=0.205, train_seg_loss=0.656]
FIT Profiler Report
---------------------------------------------------------------------------------------------------------------------
| Action                        | Mean duration (s) | Num calls | Total time (s) | Percentage %  |
---------------------------------------------------------------------------------------------------------------------
| Total                         | -                 | _         | 14.094         | 100 %         |
---------------------------------------------------------------------------------------------------------------------
| run_training_epoch            | 9.7956            | 1         | 9.7956         | 69.5          |
| run_training_batch            | 0.44689           | 12        | 5.3626         | 38.048        |
| get_train_batch               | 0.23755           | 13        | 3.0882         | 21.911        |
| fetch_next_train_batch        | 0.23753           | 13        | 3.0879         | 21.909        |
| optimizer_step_with_closure_0 | 0.2286            | 12        | 2.7432         | 19.463        |
| training_step_and_backward    | 0.2224            | 12        | 2.6688         | 18.935        |
| model_forward                 | 0.18607           | 12        | 2.2329         | 15.842        |
| training_step                 | 0.18589           | 12        | 2.2306         | 15.827        |
| backward                      | 0.035633          | 12        | 0.4276         | 3.0338        |
| on_train_epoch_end            | 0.4268            | 1         | 0.4268         | 3.0282        |
| on_train_batch_end            | 0.0063143         | 12        | 0.075772       | 0.53761       |
Environment
- PyTorch Version: 1.8.1+cu111
- Python version: 3.8.13
- OS : Ubuntu
- CUDA/cuDNN version: 11.4
- How you installed PyTorch (conda, pip, source): conda
Issue Analytics
- State:
- Created: a year ago
- Reactions: 1
- Comments: 10 (4 by maintainers)
Hi @piraka9011! Sorry for the frustration caused.
We did not foresee the problems that could be caused by setting this flag by default. This was implemented in https://github.com/PyTorchLightning/pytorch-lightning/pull/11944 and mentioned in the “Changed” section of the release notes.
I am leaning towards defaulting to what torch defaults to, which is `False`; in other words, reverting this change.
Does anybody from @PyTorchLightning/core-lightning object to this? This “fix” can be done by reviving this PR: https://github.com/PyTorchLightning/pytorch-lightning/pull/12020
Hi @piraka9011, thank you for the detailed information provided in the linked issue.
One possible cause is that we enabled `Trainer(benchmark=True)` by default from v1.6. It can lead to slower training if the input size varies frequently, and it may also lead to higher memory consumption depending on the algorithm. refs:
Could you try disabling it (`Trainer(benchmark=False)`) and checking the performance again?
@flyinghu123 @piraka9011 Do you see the regression when running without GPUs, too?
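For reference, a minimal sketch of the suggested workaround. The `Trainer(benchmark=...)` flag controls PyTorch's cuDNN benchmark mode, so the equivalent plain-PyTorch setting is shown below; the `pl.Trainer(...)` line in the comment is illustrative, not the reporter's actual `run.py`:

```python
import torch

# Lightning's Trainer(benchmark=...) toggles this underlying PyTorch setting.
# Setting it to False restores the plain-PyTorch (and Lightning v1.5) default,
# which avoids cuDNN re-benchmarking when input shapes vary between batches.
torch.backends.cudnn.benchmark = False

# With PyTorch Lightning installed, the equivalent at Trainer construction is:
#   trainer = pl.Trainer(benchmark=False, ...)

print(torch.backends.cudnn.benchmark)  # False
```

Note that `benchmark=True` only helps when input shapes are constant across batches; with varying shapes, cuDNN re-runs its algorithm search repeatedly, which matches the slowdown profile reported above.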