Latents / seeds are a mess. Make it easier to replicate a generated image using a seed.


Problem:

We often generate images with a batch_size > 1.

However, every image in the batch after the first gets, by default, a seed that is unknown to the user, so all but the first image in a batch can’t be directly replicated.

To get around this, the docs suggest that we manually feed in latents.

What’s a latent?? Like most devs, I’m arriving here with zero domain expertise.

But whatever, I figured it out (a latent seems to be an image of white noise, generated from a seed, which the diffuser looks at to begin dreaming up its image), and I did as I was told.
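
To make that description concrete, here is a minimal sketch of turning a seed into a latent. The helper name latent_from_seed, the default of 4 channels, and the use of the CPU are assumptions for illustration; the factor of 8 between image size and latent size is taken from the code later in this issue.

```python
import torch

def latent_from_seed(seed: int, channels: int = 4,
                     height: int = 512, width: int = 512) -> torch.Tensor:
    """Return the Gaussian-noise latent for one image (hypothetical helper)."""
    # A fresh generator per call, so the result depends only on the seed.
    generator = torch.Generator(device="cpu").manual_seed(seed)
    # Assumed layout: the latent grid is 8x smaller than the image per side.
    shape = (1, channels, height // 8, width // 8)
    return torch.randn(shape, generator=generator)

first = latent_from_seed(42069)
again = latent_from_seed(42069)
other = latent_from_seed(42070)

print(first.shape)                 # torch.Size([1, 4, 64, 64])
print(torch.equal(first, again))   # True: same seed, same noise
print(torch.equal(first, other))   # False: different seed, different noise
```

Knowing the seed is therefore enough to rebuild the starting noise, which is why an unknown seed makes an image impossible to replicate.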

I decided it would make sense for the seeds in a batch to be sequential, so, for any given batch of images, if you specify txt2img(prompt="astronaut riding horse", myManualSeed=42069, batch_size=6), the second image in the batch can be replicated with the seed myManualSeed + 1, and so on:

import torch

# DreamSettings and txt2imgPipe are defined elsewhere in my code
def getSequentialLatents(settings: DreamSettings, pipe=txt2imgPipe):
  # Build one latent per batch entry, seeded with seed, seed + 1, seed + 2, ...
  theDevice = "cuda"
  generator = torch.Generator(device=theDevice)
  batchWidth = settings.batchWidth
  width = settings.width
  height = settings.height
  latents = None
  thisSeed = settings.seed
  for _ in range(batchWidth):
    # Re-seed so that each latent depends only on its own seed
    generator = generator.manual_seed(thisSeed)
    newLatent = torch.randn(
      (1, pipe.unet.in_channels, height // 8, width // 8),
      generator=generator,
      device=theDevice,
    )
    # Append along the batch dimension
    latents = newLatent if latents is None else torch.cat((latents, newLatent))
    thisSeed += 1
  return latents

That was a lot of work merely to know the seeds that are present in a batch! This is a basic need for generating and refining images, and as such I believe it should be handled under the hood.

Furthermore, this hacky solution doesn’t work for img2img, as “latents” can’t be specified!

img2imgPipe(latents = sequentialLatents)
---------------------------------------------------------------------------
TypeError: __call__() got an unexpected keyword argument 'latents'

So, there is currently no easy way to know the seeds that make up your batch in img2img. If you want to perform more inference steps specifically on the second image in an img2img batch, you’re out of luck.

Proposed solution:

Make manual_seed() by default create sequential seeds for a batch, as I have sketched out above. Make manual_seed() universally do this, for txt2img, img2img, and inpainting.

Then, if you specify a seed for a batch, you will know that the second image of a batch will be (seed+1), and so on. Simple and easy.
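
A rough sketch of that behaviour, using one torch.Generator per batch entry instead of re-seeding a single generator. The helper names sequential_generators and batch_latents are hypothetical, the sketch runs on the CPU with an assumed 4 latent channels, and whether a given pipeline accepts one generator per image is an assumption that is not verified here.

```python
import torch

def sequential_generators(seed: int, batch_size: int, device: str = "cpu") -> list:
    """One generator per batch entry, seeded seed, seed + 1, ... (hypothetical helper)."""
    return [torch.Generator(device=device).manual_seed(seed + i)
            for i in range(batch_size)]

def batch_latents(generators, channels: int = 4,
                  height: int = 512, width: int = 512) -> torch.Tensor:
    """Draw one latent from each generator and stack them along the batch axis."""
    shape = (1, channels, height // 8, width // 8)
    return torch.cat([torch.randn(shape, generator=g, device=g.device)
                      for g in generators])

# A batch of six images starting at seed 42069.
batch = batch_latents(sequential_generators(42069, 6))

# The second image alone, replicated from (seed + 1) with a batch of one.
second_alone = batch_latents(sequential_generators(42069 + 1, 1))

print(batch.shape)                             # torch.Size([6, 4, 64, 64])
print(torch.equal(batch[1:2], second_alone))   # True
```

Because each entry owns its generator, the noise for image i does not depend on the batch size or on the entries drawn before it.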

Issue Analytics

  • State: closed
  • Created a year ago
  • Reactions:1
  • Comments:20 (15 by maintainers)

Top GitHub Comments

3 reactions
patrickvonplaten commented, Dec 15, 2022

https://github.com/huggingface/diffusers/pull/1718 should solve this. Also adding a nice doc page for it.

3 reactions
keturn commented, Sep 29, 2022

My favorite idea for this so far is to use a coordinate-based noise system.

A torch.Generator is a one-dimensional function, and it has internal state that advances its position every time it’s called.

A coordinate-based function would look more like this:

def noise(position, shape, seed) -> np.ndarray: ...

for use like

latents = noise(
    position = (0, 0, 0),
    shape = (4, height, width), 
    seed = 42,
)

to say “give me the three-dimensional box of noise(seed=42) that starts at (0, 0, 0) and is 4 layers deep, height tall and width wide.”

That’s an example I developed for three dimensions. A slightly different use case, but it runs into the same problems we’ve been discussing: if you change width with a one-dimensional noise generator, then everything gets all out of place, even if you just wanted to make things 12% wider. Or shift them to the left a bit. etc.

Adding another few dimensions — instead of (channel, height, width), using (step, batch_index, channel, height, width) — would enable us to do things like

channels = 4
noise(
    position = (  # starting from
        4, # step
        2, # batch entry
        0, # channel
        0, # height
        0, # width
    ),
    shape = (
        1, # one step's worth
        3, # three consecutive batch items [2, 3, 4]
        channels,
        height,
        width,
    ),
    seed=42
)

Being explicit about the dimensionality and shape of the noise makes it a lot easier to reproduce later.
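
One way such a function could be sketched is to hash each cell's absolute coordinates together with the seed, so that a value never depends on what was requested around it. Everything below is an assumption for illustration: the SplitMix64-style mixer, the Box-Muller step, and the restriction to non-negative integer coordinates are choices made for this sketch, and its statistical quality has not been validated.

```python
import numpy as np

def _mix(x: np.ndarray) -> np.ndarray:
    """SplitMix64-style finalizer; uint64 array arithmetic wraps on overflow."""
    x = (x ^ (x >> np.uint64(30))) * np.uint64(0xBF58476D1CE4E5B9)
    x = (x ^ (x >> np.uint64(27))) * np.uint64(0x94D049BB133111EB)
    return x ^ (x >> np.uint64(31))

def noise(position, shape, seed) -> np.ndarray:
    """Standard-normal noise for the box that starts at `position`.

    Each cell depends only on (seed, absolute coordinates).
    Assumes a non-negative integer seed and non-negative coordinates.
    """
    h = np.full(shape, seed, dtype=np.uint64)
    for axis, (start, length) in enumerate(zip(position, shape)):
        coords = np.arange(start, start + length, dtype=np.uint64)
        view = [1] * len(shape)
        view[axis] = length
        # Fold this axis's absolute coordinate into every cell's hash.
        h = _mix((h ^ coords.reshape(view)) + np.uint64(0x9E3779B97F4A7C15))
    # Two uniform samples per cell, from the top 53 bits of two derived hashes.
    scale = 2.0 ** -53
    u1 = 1.0 - (_mix(h ^ np.uint64(1)) >> np.uint64(11)) * scale  # in (0, 1]
    u2 = (_mix(h ^ np.uint64(2)) >> np.uint64(11)) * scale        # in [0, 1)
    # Box-Muller transform: uniform pairs to standard-normal values.
    return np.sqrt(-2.0 * np.log(u1)) * np.cos(2.0 * np.pi * u2)

height, width = 8, 8
base = noise((0, 0, 0), (4, height, width), seed=42)
wider = noise((0, 0, 0), (4, height, width + 1), seed=42)
batch = noise((4, 2, 0, 0, 0), (1, 3, 4, height, width), seed=42)
entry3 = noise((4, 3, 0, 0, 0), (1, 1, 4, height, width), seed=42)
```

With this sketch, wider[:, :, :width] agrees with base, so widening the box leaves the existing noise in place, and batch[0, 1] agrees with entry3[0, 0], so batch entry 3 at step 4 can be regenerated without its neighbours.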

The major caveats being:

  • This would still require some changes to the way schedulers access noise functions.
  • While we can make a procedural noise function like this that’s consistent across platforms, the seeds used for this are absolutely not going to be comparable with the seeds used by torch.Generator.