DDIM sampler on Stable Diffusion does not work well with CFG guidance scales larger than 6~7
Describe the bug
I have noticed that SD v1.4 and v1.5 work poorly if I swap the scheduler to DDIM and use a guidance scale larger than 7. This behavior does not seem to appear with other samplers, such as the default PNDM scheduler. At first I suspected it was related to the “train-test mismatch” mentioned in Sec. 2.3 of the Imagen paper. However, I found that the same DDIM sampler in WebUI does not suffer from this degradation at the same guidance scale. I manually traced the scale of the predicted epsilon values under the same prompt and guidance scale in both the WebUI DDIM sampler and the diffusers DDIM sampler; they all fall within a similar range of [-4, 4]. I also printed the beta schedules of both, and they are very close (I also tried replacing the diffusers DDIM betas with WebUI’s DDIM betas, but it did not help).
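For readers unfamiliar with how the guidance scale enters the sampler, here is a minimal sketch of the standard classifier-free guidance combination applied to the predicted epsilon. The function name `cfg_combine` and the numeric values are hypothetical and chosen only for illustration; they are not taken from either code base.

```python
def cfg_combine(eps_uncond, eps_text, guidance_scale):
    """Classifier-free guidance: start from the unconditional noise
    prediction and extrapolate towards the text-conditioned one."""
    return [u + guidance_scale * (t - u) for u, t in zip(eps_uncond, eps_text)]


# Made-up per-element noise predictions (assumption, illustration only).
eps_uncond = [0.10, -0.40, 0.80]
eps_text = [0.30, -0.10, 0.50]

for scale in (1.0, 2.5, 5.0, 7.5, 10.0):
    eps = cfg_combine(eps_uncond, eps_text, scale)
    print(scale, [round(e, 2) for e in eps])
```

With a scale of 1 the result is just the text-conditioned prediction; larger scales push each element further along the direction `eps_text - eps_uncond`, so the magnitude of the combined epsilon grows with the scale.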
Reproduction
Reproducing this is easy, and the behavior of the following code snippet is consistent across diffusers versions from 0.3 to 0.9:
from diffusers import StableDiffusionPipeline, DDIMScheduler
# swap any SD model here
pipe = StableDiffusionPipeline.from_pretrained('/xxxx/stable-diffusion-v1-5').to(0)
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
prompt = "A photograph of an astronaut riding a horse on Mars, high resolution, high definition."
# change the guidance scale from 1 - 15 and observe the performance degradation
image = pipe(prompt=prompt, num_inference_steps=100, guidance_scale=10., strength=1.).images[0]
[Images: generated samples at guidance_scale = 2.5, 5.0, 7.5, and 10.]
Logs
No response
System Info
Both diffusers == 0.3.0 and diffusers == 0.9.0 suffer from this issue, as do both SD v1-4 and v1-5.
Issue Analytics
- Created 9 months ago
- Comments: 5 (5 by maintainers)
Top GitHub Comments
@patrickvonplaten OMG!! Yes, this is exactly the reason why it is failing! This bug tortured me for a week! I checked everything except the clipping, and that is why DDIM is failing: its normal range is [-4, 4], and clipping makes all the guidance signal go away! Thanks a lot for finding it out, I will close this issue now.
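To make the effect described in this comment concrete, here is a toy sketch of a DDIM-style estimate of the clean sample followed by a clamp to [-1, 1]. The helper names `predict_x0` and `clip`, the scalar values, and the clamp range are all assumptions for illustration; this is not the diffusers implementation.

```python
import math


def predict_x0(x_t, eps, alpha_bar_t):
    """DDIM-style estimate of the clean sample from x_t and the predicted noise."""
    return (x_t - math.sqrt(1.0 - alpha_bar_t) * eps) / math.sqrt(alpha_bar_t)


def clip(value, lo=-1.0, hi=1.0):
    """Clamp a scalar to [lo, hi]; the [-1, 1] range is an assumption here."""
    return max(lo, min(hi, value))


# Hypothetical values for a single latent element (illustration only).
x_t = 1.0
alpha_bar_t = 0.5
eps_uncond, eps_text = 0.1, -0.1

for scale in (2.5, 5.0, 7.5, 10.0):
    eps = eps_uncond + scale * (eps_text - eps_uncond)  # classifier-free guidance
    x0 = predict_x0(x_t, eps, alpha_bar_t)
    print(f"scale={scale}: x0={x0:.2f}, clipped={clip(x0):.2f}")
```

In this toy setting the unclipped estimates differ for every guidance scale, yet they all land outside [-1, 1], so the clamp maps them to the same value and the difference introduced by the guidance scale is lost. In diffusers, `DDIMScheduler` has a `clip_sample` configuration option that controls this kind of clamping; check the linked PR for the exact change that was made.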
Here is the PR that updates the conversion script: https://github.com/huggingface/diffusers/pull/1667