Training results (IS and FID) are not as good as yours with the same training process
Hi ajbrock,
I was running the training code on ImageNet using the default script launch_BigGAN_bs256x8.sh. It has finished 134k iterations, and here is the log file.
Compared with the log file that you released, I got worse results, even though I kept all the parameters the same as your default settings. The training is on 8x V100. Do you have any suggestions for making it better? Or what should I check to get a similar result to yours?
Thanks a lot!
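For reference, a minimal sketch for inspecting the IS/FID trajectory in a metrics log. It assumes a JSON-lines format with itr, IS_mean, IS_std, and FID fields per record; the file name below is a placeholder, not necessarily the path this repo's logger writes, so adjust both to your actual log.

```python
# Minimal sketch: parse a JSON-lines metrics log and print the IS/FID trajectory.
# Assumption: each line looks like {"itr": 2000, "IS_mean": ..., "IS_std": ..., "FID": ...}.
# LOG_PATH is a hypothetical placeholder -- point it at your own log file.
import json

LOG_PATH = "logs/BigGAN_ImageNet_log.jsonl"

records = []
with open(LOG_PATH) as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        rec = json.loads(line)
        if "IS_mean" in rec and "FID" in rec:
            records.append(rec)

records.sort(key=lambda r: r["itr"])
for rec in records:
    print(f"itr {rec['itr']:>7d}  IS {rec['IS_mean']:6.2f} ± {rec.get('IS_std', 0.0):4.2f}  FID {rec['FID']:6.2f}")

# Best FID seen so far (lower is better) and the iteration where it occurred.
best = min(records, key=lambda r: r["FID"])
print(f"Best FID {best['FID']:.2f} at itr {best['itr']}")
```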
Issue Analytics
- Created: 4 years ago
- Comments: 14 (4 by maintainers)
Top GitHub Comments
Hi Qi,
There can be a substantial amount of variance in the time to convergence for a model (I only had time to train one with this codebase, as I don’t have unfettered access to that kind of compute), so it’s not surprising that yours might need longer to converge/collapse; it appears to still be training.
I’d say let it run and see what IS/FID it gets to when it explodes and dies. This would also be a helpful datapoint for this repo to start getting a better sense of the variance in #itrs required =); if you wouldn’t mind posting the full logfile (e.g. in a pastebin), I can take a look at it and check for any anomalies.
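For comparing a run against the released log, a rough sketch that overlays the two IS/FID curves and flags a sudden FID jump as a likely collapse point. The paths and JSON keys are assumptions rather than this repo's guaranteed logger output, so adapt them to whatever your training setup actually writes.

```python
# Rough sketch: overlay two IS/FID curves and flag a likely collapse point.
# Assumptions: both logs are JSON-lines with "itr", "IS_mean", "FID" fields;
# the paths below are hypothetical placeholders for your log and the released reference log.
import json
import matplotlib.pyplot as plt

def load_log(path):
    with open(path) as f:
        recs = [json.loads(line) for line in f if line.strip()]
    recs = [r for r in recs if "FID" in r]
    recs.sort(key=lambda r: r["itr"])
    return recs

mine = load_log("logs/my_run_log.jsonl")          # hypothetical path to your run's log
theirs = load_log("logs/released_run_log.jsonl")  # hypothetical path to the released log

fig, (ax_is, ax_fid) = plt.subplots(1, 2, figsize=(10, 4))
for recs, label in [(mine, "my run"), (theirs, "released log")]:
    itrs = [r["itr"] for r in recs]
    ax_is.plot(itrs, [r["IS_mean"] for r in recs], label=label)
    ax_fid.plot(itrs, [r["FID"] for r in recs], label=label)
ax_is.set_xlabel("iteration"); ax_is.set_ylabel("IS"); ax_is.legend()
ax_fid.set_xlabel("iteration"); ax_fid.set_ylabel("FID"); ax_fid.legend()
plt.tight_layout()
plt.savefig("is_fid_comparison.png")

# Crude collapse check: flag the first eval where FID jumps by more than 50% over the previous one.
for prev, cur in zip(mine, mine[1:]):
    if prev["FID"] > 0 and cur["FID"] > 1.5 * prev["FID"]:
        print(f"Possible collapse around itr {cur['itr']}: FID {prev['FID']:.1f} -> {cur['FID']:.1f}")
        break
```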
Hi guys, how do you handle the above issues? Did you manage to reproduce the released results?