DreamBooth fails when training with 8bit adam optimizer.
See original GitHub issue.

Describe the bug
The 8-bit Adam optimizer seems to fail when training DreamBooth.
The command I used:
python train_dreambooth.py \
  --pretrained_model_name_or_path="CompVis/stable-diffusion-v1-4" \
  --instance_data_dir=$instance_folder \
  --class_data_dir=$class_folder \
  --output_dir=$model_checkpoints_folder \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks dog" \
  --class_prompt="a photo of dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=2 \
  --gradient_checkpointing \
  --use_8bit_adam \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=800
Reproduction
No response
Logs
RuntimeError: set_sizes_and_strides is not allowed on a Tensor created from .data or .detach().
If your intent is to change the metadata of a Tensor (such as sizes / strides / storage / storage_offset)
without autograd tracking the change, remove the .data / .detach() call and wrap the change in a `with torch.no_grad():` block.
For example, change:
x.data.set_(y)
to:
with torch.no_grad():
    x.set_(y)
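The pattern the traceback recommends can be checked in isolation, outside the DreamBooth script. A minimal sketch in plain PyTorch (the tensor names here are illustrative, not taken from the training code):

```python
import torch

x = torch.zeros(4, requires_grad=True)
y = torch.ones(8)

# Wrong: mutating metadata through .data bypasses autograd bookkeeping and,
# on newer PyTorch builds, raises the RuntimeError quoted above:
# x.data.set_(y)

# Recommended pattern from the error message: drop .data and wrap the
# metadata change in a no_grad block so autograd does not track it.
with torch.no_grad():
    x.set_(y)

print(x.shape)  # torch.Size([8])
```

After `set_`, `x` shares `y`'s storage and shape, while keeping its own `requires_grad` flag.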
System Info
- diffusers version: 0.4.0
- Platform: Linux-5.8.0-44-generic-x86_64-with-glibc2.29
- Python version: 3.8.10
- PyTorch version (GPU?): torch 1.10
- Huggingface_hub version: 0.10.0
- Transformers version: 4.22.2
- Using GPU in script?: <fill in>
- Using distributed or parallel set-up in script?: <fill in>
Issue Analytics
- State:
- Created a year ago
- Comments: 7 (5 by maintainers)
Top Results From Across the Web
- Dreambooth broken, possibly because of ADAM optimizer ...: "Ya imagine that's impossible if the entire latent space is getting regularized and forgetting everything it learned in its original training."
- [D] Dreambooth Stable Diffusion training in just 12.5 GB ...: "[D] Dreambooth Stable Diffusion training in just 12.5 GB VRAM, using the 8bit adam optimizer from bitsandbytes along with xformers while being 2 ..."
- Accelerate + Multi-GPU + Automatic1111 + Dreambooth ...: "I'm currently trying to use accelerate to run Dreambooth via ... 8 Bit Adam = Yes ... Applying xformers cross attention optimization."
- Stable Diffusion Tutorial Part 1: Run Dreambooth in Gradient ...: "In this tutorial, we will walk step-by-step through the setup, training, and inference of a Dreambooth Stable Diffusion model within a ..."
- How to Fine-tune Stable Diffusion using Dreambooth: "This tutorial focuses on how to fine-tune Stable Diffusion using another method called Dreambooth. Unlike textual inversion method which train ..."
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
@paudom, I faced the same issue in a conda environment created with pytorch==1.10.2 and cudatoolkit=11.3.1. I solved my issue by running it inside a Docker container using their latest image, nvcr.io/nvidia/pytorch:22.09-py3.
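For reference, the workaround above amounts to running the training script inside the NGC PyTorch container. A hypothetical invocation (the mounted path and the `--gpus` flag are assumptions; the image tag is the one named in the comment):

```shell
# Requires the NVIDIA Container Toolkit for GPU passthrough.
# /path/to/diffusers is a placeholder for your local checkout.
docker run --gpus all -it --rm \
  -v /path/to/diffusers:/workspace/diffusers \
  nvcr.io/nvidia/pytorch:22.09-py3 \
  bash
```

From the shell inside the container you would then install diffusers/bitsandbytes and launch train_dreambooth.py as shown earlier.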
@nanlliu Could you please tell me which PyTorch version or setup you finally used? Thanks! I'm having the same issue.