Right set of UNet hyperparameters when training DDPM
Hi there! I am currently training a DDPM model on a custom image dataset, following the cool unconditional_image_generation example script.
Since I don't have the compute for comprehensive hyperparameter tuning of my architecture, I was wondering whether there are any common intuitions for designing the UNet denoiser: width/length of the residual blocks, number and positions of the attention blocks, etc., with respect to the number of samples in the training set as well as their resolution.
If anyone has extensive experience training DMs, it would be super cool to share insights here or in a dedicated blog post, such as the one discussing hyperparameter choices when training Dreambooth.
Thank you ! 🤗
Issue Analytics
- State:
- Created 10 months ago
- Comments:5 (3 by maintainers)
Top GitHub Comments
Maybe cc @anton-l ?
Hello @anton-l, thank you for your insight !
I also found that Dhariwal & Nichol discuss hyperparameter tuning for DDPM in their paper Diffusion Models Beat GANs on Image Synthesis.
Will close this for now !
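For anyone landing on this thread later, the knobs asked about in the question (block widths, residual layers per block, and where attention sits) can be made concrete with a small sketch. The helper below is hypothetical and not part of any library: it only builds a dictionary whose keys follow the constructor argument names of `UNet2DModel` in diffusers, and it places attention blocks at whichever feature-map resolutions you list. All default values are illustrative assumptions, not recommendations.

```python
from typing import Dict, Iterable, Sequence


def unet_kwargs(
    resolution: int,
    base_channels: int = 128,
    channel_mults: Sequence[int] = (1, 1, 2, 2, 4, 4),
    layers_per_block: int = 2,
    attention_resolutions: Iterable[int] = (16,),
) -> Dict[str, object]:
    """Hypothetical helper: build keyword arguments for a UNet denoiser.

    The keys mirror diffusers' UNet2DModel argument names; the defaults
    are illustrative assumptions only.
    """
    attn = set(attention_resolutions)
    levels = len(channel_mults)
    if resolution % (2 ** (levels - 1)) != 0:
        raise ValueError("resolution must be divisible by 2 ** (levels - 1)")

    # Side length of the feature map seen by each down block
    # (halved from one level to the next).
    level_res = [resolution // (2 ** i) for i in range(levels)]

    # Use an attention block wherever the level's resolution was requested.
    down = tuple(
        "AttnDownBlock2D" if r in attn else "DownBlock2D" for r in level_res
    )
    # Up blocks are listed from the deepest level back to the input size.
    up = tuple(
        "AttnUpBlock2D" if r in attn else "UpBlock2D" for r in reversed(level_res)
    )

    return {
        "sample_size": resolution,
        "in_channels": 3,
        "out_channels": 3,
        "layers_per_block": layers_per_block,
        "block_out_channels": tuple(base_channels * m for m in channel_mults),
        "down_block_types": down,
        "up_block_types": up,
    }


if __name__ == "__main__":
    # Example: a 64x64 model with attention at the 16x16 and 8x8 levels.
    for key, value in unet_kwargs(64, attention_resolutions=(16, 8)).items():
        print(f"{key}: {value}")
```

If diffusers is installed, the resulting dictionary can be unpacked as `UNet2DModel(**unet_kwargs(...))`. Keeping the layout in one function like this makes it cheap to compare a handful of widths and attention placements when a full hyperparameter sweep is out of reach.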