img2img results in noisy image with low num_inference_steps
I’m trying to make a video with the StableDiffusionImg2ImgPipeline but am encountering some unexpected behavior.
I generate the first frame by using the StableDiffusionPipeline with relatively many steps (e.g. 100). This frame is then slightly warped (e.g. rotation, translation, zoom). Next I feed the frame to StableDiffusionImg2ImgPipeline to slightly refine the warped image. The translated image will now have empty sides (black pixels), which I expect to be made coherent.
However, when using a small num_inference_steps, the image becomes very noisy. Using more num_inference_steps results in a completely different image. I realize I can use strength to modulate this behavior, but since it is essentially the same as changing num_inference_steps, it doesn’t help.
I realize that inpainting could help, but this only works for translate and rotate, since for zoom there are no black pixels (mask).
Is there any setting for which this video generation would be smooth? Or is the StableDiffusionImg2ImgPipeline just not suited for this application/task?
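For readers following along, the workflow described above can be sketched roughly as the loop below. This is only an illustration, not the poster's actual code: `generate_frames` and `warp` are hypothetical names, and `img2img_pipe` is assumed to be a callable that accepts `prompt`, `image`, `strength` and `num_inference_steps` and returns an object with an `images` list, as a loaded StableDiffusionImg2ImgPipeline does.

```python
from typing import Any, Callable, List


def generate_frames(
    first_frame: Any,
    warp: Callable[[Any], Any],
    img2img_pipe: Callable[..., Any],
    prompt: str,
    num_frames: int,
    strength: float,
    num_inference_steps: int,
) -> List[Any]:
    """Warp the previous frame, then refine it with img2img, frame by frame.

    `warp` is a hypothetical user-supplied function (rotation, translation,
    zoom, ...). `img2img_pipe` is assumed to behave like a loaded
    StableDiffusionImg2ImgPipeline.
    """
    frames = [first_frame]
    for _ in range(num_frames - 1):
        # Slightly transform the last frame; this may leave empty borders.
        warped = warp(frames[-1])
        # Ask img2img to refine the warped frame.
        result = img2img_pipe(
            prompt=prompt,
            image=warped,
            strength=strength,
            num_inference_steps=num_inference_steps,
        )
        frames.append(result.images[0])
    return frames
```

The first frame would come from a separate text-to-image call, and `strength` and `num_inference_steps` are left as arguments because their interaction is exactly what the question is about.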
Issue Analytics
- State:
- Created 10 months ago
- Comments:5 (3 by maintainers)
Top GitHub Comments
Ok, I got your point. The reason for this behavior is that setting num_inference_steps = 2 and strength = 1 is not the same as setting num_inference_steps = 10 and strength = 0.2.
If you look at the pipeline code, you can see that it calls self.scheduler.set_timesteps before calling self.get_timesteps:
https://github.com/huggingface/diffusers/blob/f07a16e09bb5b1cf4fa2306bfa4ea791f24fa968/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_img2img.py#L532-L533
If you go to the scheduler code (PNDM in this case), you will see that the value of num_inference_steps changes the result of step_ratio = self.config.num_train_timesteps // self.num_inference_steps:
https://github.com/huggingface/diffusers/blob/f07a16e09bb5b1cf4fa2306bfa4ea791f24fa968/src/diffusers/schedulers/scheduling_pndm.py#L157
A small num_inference_steps gives a large step_ratio, so each denoising step has to cover a large jump in timesteps. This is why you are seeing this amount of noise. My recommendation in this case is to use a higher num_inference_steps and set a small strength, something like num_inference_steps = 20 and strength = 0.2.
I hope it helps you in some way 😃
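To make the difference concrete, here is a small pure-Python sketch of the arithmetic. It is a simplification, not the library code: it assumes 1000 training timesteps, an evenly spaced schedule built from the step_ratio formula quoted above, and that strength keeps only the last int(num_inference_steps * strength) steps of that schedule. The real PNDM scheduler adds further details (such as an offset and a repeated timestep) that are left out here, and `simplified_schedule` is a hypothetical helper name.

```python
NUM_TRAIN_TIMESTEPS = 1000  # assumption: a common value for Stable Diffusion schedulers


def simplified_schedule(num_inference_steps, strength,
                        num_train_timesteps=NUM_TRAIN_TIMESTEPS):
    """Return (step_ratio, timesteps actually run) under the stated assumptions."""
    # Same formula as quoted from the scheduler code.
    step_ratio = num_train_timesteps // num_inference_steps
    # Evenly spaced timesteps, from most noisy to least noisy.
    timesteps = [i * step_ratio for i in range(num_inference_steps)][::-1]
    # Assumed effect of strength: skip the first part of the schedule and
    # run only the last int(num_inference_steps * strength) steps.
    steps_to_run = min(int(num_inference_steps * strength), num_inference_steps)
    t_start = max(num_inference_steps - steps_to_run, 0)
    return step_ratio, timesteps[t_start:]


for steps, strength in [(2, 1.0), (10, 0.2), (20, 0.2)]:
    ratio, run = simplified_schedule(steps, strength)
    print(f"steps={steps}, strength={strength}: step_ratio={ratio}, timesteps={run}")
```

Under these assumptions, both (2, 1.0) and (10, 0.2) run two denoising steps, but the first jumps 500 timesteps at a time while the second jumps only 100. With (20, 0.2) the run becomes four steps of 50 each.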
Can only second @pedrogengo here, the strength parameter is way too strong