help with training

Thanks for the awesome code! I am training my own model right now and have a few questions:

Currently I am using 100k (out of around 1.8m) images from CelebAMask-HQ, ffhq and vggface to train the model. Did you use the full set to train your model?
I don’t see much improvement in most of the losses anymore (160k steps trained, 4 GPUs x 12 images/batch). Is this normal? Should I just continue training for more steps?
I also checked the validation results, and the reconstruction is not good.
I noticed shuffle for the training dataloader is not set to True. Did you use the same setting?

Thanks!

Issue Analytics

Created 3 years ago
Comments: 28

Top GitHub Comments

2 reactions

niuyuanc commented, Nov 19, 2022

@antonsanchez Could you please share the pretrained weights? Thank you so much 🙏🙏🙏 My email: niuyuanc@163.com Thanks🙏🙏🙏

2 reactions

y-x-c commented, Nov 9, 2020

Hi! You did very fast training!

Yes, I used the full dataset. I don’t know about the IJB-C dataset. The distribution of the dataset can influence your model.

In the paper, they trained for 500K steps. I trained for over 500K. To my eye, your attribute_loss is going down, but your Rec and ID losses are unstable. In my case, those two losses were more stable and lower at the same step count.

The shuffle option in the training dataloader should be True. It was clearly my mistake when publishing.

Thanks for your reply.

I just corrected the description, I am using the same datasets (CelebAMask-HQ, ffhq and vggface) as well.

So in your case, each step has 64 images, and let’s say there are 1.5m images in those three datasets, so you trained for around 4 epochs (= 64 * 500000 / 1500000 / 5) in total?
In my case, each step only has 48 images, so maybe that’s why the two losses are higher at the same steps.
I found the Rec loss is going much lower in the third epoch, and the results are much better than before. I will continue my current training and see what’s going on.
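
The step-to-epoch arithmetic above can be checked with a small sketch. The function name `epochs_seen` is made up for illustration, and 1.5m is the rough dataset size guessed in the thread, not a measured figure. The thread's formula also divides by 5 without saying what that factor stands for, so it is reproduced here as-is and not interpreted.

```python
def epochs_seen(steps: int, images_per_step: int, dataset_size: int) -> float:
    """Full passes over the dataset after `steps` training steps."""
    return steps * images_per_step / dataset_size


# Figures quoted in the thread; the dataset size is a rough guess.
DATASET_SIZE = 1_500_000

author_passes = epochs_seen(500_000, 64, DATASET_SIZE)   # 64 images/step
poster_passes = epochs_seen(160_000, 48, DATASET_SIZE)   # 4 GPUs x 12 images

print(f"64 images/step, 500k steps: {author_passes:.2f} passes")  # 21.33
print(f"48 images/step, 160k steps: {poster_passes:.2f} passes")  # 5.12

# The thread's own formula divides by an extra, unexplained factor of 5.
print(f"thread's formula: {author_passes / 5:.2f}")               # 4.27
```

Comparing images seen (steps times images per step) is a fairer basis than comparing raw step counts when the batch sizes differ, which is the point being made about 48 versus 64 images per step.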

Thanks for the clarification, I also changed to True during my training.
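
To illustrate what the shuffle setting changes, here is a minimal pure-Python sketch. It is not the repository's dataloader: `iter_batches` and its parameters are hypothetical, and it only assumes the real loader behaves like a typical one, where turning shuffle on (for example `shuffle=True` in PyTorch's `DataLoader`) reorders the samples at every epoch.

```python
import random


def iter_batches(num_samples, batch_size, shuffle, epoch, seed=0):
    """Yield lists of sample indices, one list per batch (hypothetical helper)."""
    indices = list(range(num_samples))
    if shuffle:
        # Seed with the epoch so each epoch gets a different, reproducible order.
        random.Random(seed + epoch).shuffle(indices)
    for start in range(0, num_samples, batch_size):
        yield indices[start:start + batch_size]


# Without shuffling, every epoch sees exactly the same batches in the same order.
fixed_e0 = list(iter_batches(10, 4, shuffle=False, epoch=0))
fixed_e1 = list(iter_batches(10, 4, shuffle=False, epoch=1))
print(fixed_e0)              # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
print(fixed_e0 == fixed_e1)  # True

# With shuffling, each epoch still covers every sample once, in a new order.
mixed_e0 = list(iter_batches(10, 4, shuffle=True, epoch=0))
mixed_e1 = list(iter_batches(10, 4, shuffle=True, epoch=1))
print(mixed_e0)
print(mixed_e1)
```

If the three datasets are concatenated in order, an unshuffled loader would presumably feed long runs of images from a single dataset, which is one plausible reason the setting matters here.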
