img2img results in noisy image with low num_inference_steps

I’m trying to make a video with the StableDiffusionImg2ImgPipeline but am encountering some unexpected behavior. I generate the first frame by using the StableDiffusionPipeline with relatively many steps (e.g. 100). This frame is then slightly warped (e.g. rotation, translation, zoom). Next I feed the frame to StableDiffusionImg2ImgPipeline to slightly refine the warped image. The translated image will have empty sides now (black pixels) which I expect to be made coherent.

However, when using a small num_inference_steps, the image becomes very noisy. Using a larger num_inference_steps results in a completely different image. I realize I can use strength to modulate this behavior, but since it is essentially the same as changing num_inference_steps, it doesn’t help.

I realize that inpainting could help, but it only works for translation and rotation; with zoom there are no black pixels to use as a mask.

Is there any setting for which this video generation would be smooth? Or is the StableDiffusionImg2ImgPipeline just not suited for this application/task?

Issue Analytics

Created 10 months ago
Comments: 5 (3 by maintainers)

Top GitHub Comments

2 reactions

pedrogengo commented, Nov 23, 2022

Ok, I got your point. The reason for this behavior is that the two settings are not equivalent: num_inference_steps = 2 with strength = 1. is not the same as num_inference_steps = 10 with strength = 0.2.

If you look at the pipeline code, you can see that it calls self.scheduler.set_timesteps before calling self.get_timesteps:

https://github.com/huggingface/diffusers/blob/f07a16e09bb5b1cf4fa2306bfa4ea791f24fa968/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_img2img.py#L532-L533

If you go to the scheduler code (PNDM in this case), you will see that the value of num_inference_steps changes the result of step_ratio = self.config.num_train_timesteps // self.num_inference_steps, and therefore the spacing between the timesteps:

https://github.com/huggingface/diffusers/blob/f07a16e09bb5b1cf4fa2306bfa4ea791f24fa968/src/diffusers/schedulers/scheduling_pndm.py#L157

This is why you are seeing this amount of noise. My recommendation in this case is to use a higher num_inference_steps with a small strength, something like num_inference_steps = 20 and strength = 0.2.
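To make the difference concrete, here is a rough sketch in plain Python of how the two parameters interact. It is a simplification, not the actual diffusers code: it assumes num_train_timesteps = 1000, ignores the scheduler's steps_offset and the extra timesteps PNDM inserts, and the function names are made up for illustration.

```python
def scheduler_timesteps(num_inference_steps, num_train_timesteps=1000):
    # Simplified stand-in for scheduler.set_timesteps: evenly spaced
    # timesteps, highest first. num_train_timesteps=1000 is an assumption.
    step_ratio = num_train_timesteps // num_inference_steps
    return [i * step_ratio for i in range(num_inference_steps)][::-1]


def img2img_timesteps(num_inference_steps, strength, num_train_timesteps=1000):
    # Simplified stand-in for the pipeline's get_timesteps: strength only
    # decides how many of the scheduler's timesteps are kept at the end.
    timesteps = scheduler_timesteps(num_inference_steps, num_train_timesteps)
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    t_start = max(num_inference_steps - init_timestep, 0)
    return timesteps[t_start:]


for steps, strength in [(2, 1.0), (10, 0.2), (20, 0.2)]:
    print(steps, strength, img2img_timesteps(steps, strength))
# 2 1.0 [500, 0]
# 10 0.2 [100, 0]
# 20 0.2 [150, 100, 50, 0]
```

Under these assumptions, the first two settings both run two denoising steps, but the first starts from a much noisier timestep and takes far larger jumps. The suggested 20 steps with strength 0.2 starts from a low noise level and gets four smaller steps to clean it up.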

I hope it helps you in some way 😃

1 reaction

patrickvonplaten commented, Nov 29, 2022

Can only second @pedrogengo here, the strength parameter is way too strong
