Dreambooth: No checkpoints created and loop


Hi,

I tried to train a model and, despite having selected the checkpoints option, none were created in my Google Drive root folder. It also didn't create the final CKPT, and the notebook started again, automatically creating a second one: https://imgur.com/a/0b8aWVG

My config was like this:

Start DreamBooth
Resume_Training:

If you're not satisfied with the result, check this box and run the cell again; it will continue training the current model.
Training_Steps:
7500
Total Steps = Number of Instance images * 200. If you use 30 images, use 6000 steps. If you're not satisfied with the result, resume training for another 500 steps, and so on.
Seed:
Insert text here
Leave empty for a random seed.
Resolution:

512
Higher resolution = Higher quality. Make sure the instance images are cropped to this selected size (or larger). If you're getting memory issues, check the box below (slower speed but memory efficient):
Reduce_memory_usage:

fp16:

Enable/disable half-precision, disabling it will double the training time and produce 4.7Gb checkpoints.
Enable_text_encoder_training:

At least 10% of the total training steps are needed. It doesn't matter whether they come at the beginning, in the middle, or at the end, in case you're training the model multiple times.
For example, you can divide them as 5%, 5%, 5% over 3 training runs on the model, or 0%, 0%, 15%, given that 15% will cover the total training steps count (15% of 200 steps is not enough).
Enter the % of the total steps for which to train the text_encoder
Train_text_encoder_for:
100
Keep the % low for better style transfer; more training steps will be necessary for good results.
A higher % will give more weight to the instance. It gives stronger results at lower step counts, but the result is harder to stylize.
Save_Checkpoint_Every_n_Steps:

Save_Checkpoint_Every:
500
Minimum 200 steps between each save.
Start_saving_from_the_step:
1500
Start saving intermediary checkpoints from this step.

Thanks!
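For reference, the numbers in the configuration above can be sanity-checked with a small sketch. It is not the notebook's code: the function names are hypothetical, and it assumes that an intermediate checkpoint is written at `Start_saving_from_the_step` and then every `Save_Checkpoint_Every` steps up to the last training step. The notebook's actual save logic may differ.

```python
from typing import List


def suggested_total_steps(num_instance_images: int, steps_per_image: int = 200) -> int:
    """Rule of thumb quoted in the notebook text: instance images * 200."""
    return num_instance_images * steps_per_image


def expected_checkpoint_steps(training_steps: int, save_every: int, start_from: int) -> List[int]:
    """Steps at which an intermediate checkpoint would be expected.

    Assumption (not confirmed by the notebook): the first save happens at
    start_from, then one save every save_every steps, up to and including
    training_steps.
    """
    if save_every <= 0:
        raise ValueError("save_every must be positive")
    return list(range(start_from, training_steps + 1, save_every))


# Values taken from the configuration in the issue.
steps = expected_checkpoint_steps(training_steps=7500, save_every=500, start_from=1500)
print(len(steps), steps[0], steps[-1])
```

Under that assumption, the reported settings should have produced several intermediate checkpoints well before the run ended, so an empty Google Drive folder points at the save step itself rather than at the chosen values.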

Issue Analytics

  • State: closed
  • Created 10 months ago
  • Comments: 15 (6 by maintainers)

Top GitHub Comments

1 reaction
TheLastBen commented, Nov 18, 2022

@Xtreamer if you set it to 100, it will train the text encoder for 15800 steps and the UNet for 15800 steps. Set it to something like 30% instead.

Don’t use the “contains_faces” option; it’s experimental and doesn’t make much difference.
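To make the percentage advice concrete, here is a minimal sketch of how a `Train_text_encoder_for` value could translate into step counts. The helper name is hypothetical, and integer rounding down is an assumption; the notebook may compute this differently.

```python
def text_encoder_steps(training_steps: int, percent: int) -> int:
    """Number of steps spent training the text encoder.

    Assumption: the percentage is applied to the total training steps and
    rounded down. This is an illustration, not the notebook's own code.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return training_steps * percent // 100


# 15800 is the step count mentioned in the comment above.
print(text_encoder_steps(15800, 100))
print(text_encoder_steps(15800, 30))
```

With 100%, the text encoder would be trained for as many steps as the UNet, which matches the maintainer's description; at 30% it would be trained for 4740 of the 15800 steps.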

1 reaction
TheLastBen commented, Nov 13, 2022

I’ll check that out


