DDIM sampler on Stable Diffusion does not work well with a CFG guidance scale larger than 6~7

Describe the bug

I have noticed that SD v1.4 and v1.5 work poorly if I swap the scheduler to DDIM and use a guidance scale larger than 7. The behavior does not seem to be as obvious with other samplers, such as the default PNDM scheduler. At first I suspected it was related to the “train-test mismatch” mentioned in Sec. 2.3 of the Imagen paper. However, I found that the same DDIM sampler in WebUI does not suffer from the same degradation at the same guidance scale. I manually traced the scale of the predicted epsilon values under the same prompt and guidance scale in both the WebUI DDIM sampler and the diffusers DDIM sampler; both stay within a similar range of [-4, 4]. I also printed the beta schedules of both, and they are very close. (I tried replacing the diffusers DDIM betas with WebUI’s DDIM betas, but it did not help.) [Screenshot 2022-12-08 at 15 08 04]
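For readers unfamiliar with how the guidance scale enters the computation, here is a minimal sketch of the standard classifier-free guidance combination applied to the predicted epsilon. The function name `apply_cfg` and the toy numbers are hypothetical and chosen only for illustration; real predictions are tensors produced by the UNet, not short Python lists.

```python
def apply_cfg(eps_uncond, eps_text, guidance_scale):
    """Combine unconditional and text-conditioned noise predictions.

    Standard classifier-free guidance:
        eps = eps_uncond + guidance_scale * (eps_text - eps_uncond)
    """
    return [u + guidance_scale * (t - u) for u, t in zip(eps_uncond, eps_text)]


# Hypothetical toy predictions (not taken from a real model run).
eps_uncond = [0.2, -0.4, 1.0]
eps_text = [0.6, -1.0, 1.5]

# A larger scale pushes the guided prediction further from the
# unconditional one, so its magnitude tends to grow with the scale.
for scale in (1.0, 2.5, 7.5, 10.0):
    guided = apply_cfg(eps_uncond, eps_text, scale)
    print(scale, max(abs(v) for v in guided))
```

With a scale of 1.0 the result equals the text-conditioned prediction; anything above that extrapolates past it, which is why issues that only show up at high scales are worth checking against every step that bounds or clips values.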

Reproduction

Reproduction is easy, and the behavior of the following code snippet is consistent across diffusers versions from 0.3 to 0.9.

from diffusers import StableDiffusionPipeline, DDIMScheduler
# swap any SD model here
pipe = StableDiffusionPipeline.from_pretrained('/xxxx/stable-diffusion-v1-5').to(0)
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
prompt = "A photograph of an astronaut riding a horse on Mars, high resolution, high definition."
# change the guidance scale from 1 - 15 and observe the performance degradation
image = pipe(prompt=prompt, num_inference_steps=100, guidance_scale=10., strength=1.).images[0]

guidance_scale = 2.5: [Screenshot 2022-12-08 at 15 19 25]

guidance_scale = 5.0: [Screenshot 2022-12-08 at 15 20 10]

guidance_scale = 7.5: [Screenshot 2022-12-08 at 15 22 16]

guidance_scale = 10.0: [Screenshot 2022-12-08 at 15 21 10]

Logs

No response

System Info

Both diffusers == 0.3.0 and diffusers == 0.9.0 show this issue, as do both SD v1-4 and v1-5.

Issue Analytics

  • State: closed
  • Created 9 months ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

Randolph-zeng commented, Dec 13, 2022 (1 reaction)

@patrickvonplaten OMG!! Yes, this is exactly the reason why it was failing! Thanks a lot for finding it. This bug tortured me for a week! I checked everything except the clipping, and that is why DDIM was failing: its normal range is [-4, 4], and clipping makes all the guidance signal go away. I will close this issue now.
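To illustrate why clipping can erase the guidance signal, here is a minimal sketch of one deterministic DDIM update (eta = 0) on a single scalar, with an optional clip of the predicted clean sample to [-1, 1]. This is an illustration under stated assumptions, not the diffusers implementation: the helper `ddim_step`, the alpha values, and the two epsilon values are all hypothetical.

```python
import math


def ddim_step(x_t, eps, alpha_prod_t, alpha_prod_prev, clip_sample):
    """One deterministic DDIM update (eta = 0) for a single scalar.

    Returns (pred_x0, x_prev). When clip_sample is True, the predicted
    clean sample is clamped to [-1, 1] before it is used.
    """
    pred_x0 = (x_t - math.sqrt(1.0 - alpha_prod_t) * eps) / math.sqrt(alpha_prod_t)
    if clip_sample:
        pred_x0 = max(-1.0, min(1.0, pred_x0))
    x_prev = math.sqrt(alpha_prod_prev) * pred_x0 + math.sqrt(1.0 - alpha_prod_prev) * eps
    return pred_x0, x_prev


# Hypothetical values: two epsilon predictions standing in for a weaker
# and a stronger guidance signal at the same timestep.
x_t, alpha_prod_t, alpha_prod_prev = 1.0, 0.5, 0.6

for clip in (False, True):
    for eps in (-1.0, -1.5):
        pred_x0, x_prev = ddim_step(x_t, eps, alpha_prod_t, alpha_prod_prev, clip)
        print(f"clip={clip} eps={eps} pred_x0={pred_x0:.3f} x_prev={x_prev:.3f}")
```

Without clipping, the two predictions give different clean-sample estimates, both inside [-4, 4]. With clipping, both collapse to 1.0, so the difference that guidance introduced in the clean-sample estimate is lost. In diffusers, this behavior is controlled by the `clip_sample` option of `DDIMScheduler`.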

patrickvonplaten commented, Dec 12, 2022 (0 reactions)

Here is the PR that updates the conversion script: https://github.com/huggingface/diffusers/pull/1667
