
Reduce Stable Diffusion memory usage by keeping only the UNet on GPU

See original GitHub issue

Is your feature request related to a problem? Please describe.
Stable Diffusion is not compute-heavy at every step. If we keep the diffusion UNet in fp16 on the GPU and everything else on the CPU, we could reduce GPU memory usage to 2.2 GB with a not-so-big impact on performance. This should democratize Stable Diffusion even further.

The only other change needed is to move tensors between devices accordingly, and we can use the models' device and dtype attributes to make everything work.

Describe the solution you’d like
I think what I’m proposing in https://github.com/huggingface/diffusers/pull/537 should be enough.

Describe alternatives you’ve considered
The alternative is to use GPUs for the whole process and pay more for it.
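The placement scheme described above can be sketched in plain PyTorch. This is a minimal illustration, not the actual PR: `TinyUNet` and `text_encoder` are hypothetical stand-ins for the real diffusers components, and the key idea is reading the model parameters' `device` and `dtype` attributes to move tensors, rather than hard-coding placements.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the diffusion UNet; the real class lives in diffusers.
class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(4, 4, kernel_size=3, padding=1)

unet = TinyUNet()
text_encoder = nn.Linear(8, 8)  # stand-in for a CPU-resident component

# Keep only the UNet in fp16 on the accelerator (falling back to CPU here);
# everything else stays in fp32 on the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
unet.to(device=device, dtype=torch.float16)
text_encoder.to("cpu")

# Move incoming tensors to match the model they feed, reading the
# parameters' device and dtype attributes instead of hard-coding them.
param = next(unet.parameters())
latents = torch.randn(1, 4, 8, 8)
latents = latents.to(device=param.device, dtype=param.dtype)

print(latents.dtype)  # torch.float16
```

Because every transfer is derived from the target model's attributes, the same loop works unchanged whether the UNet sits on `cuda` in fp16 or on `cpu` in fp32.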

Issue Analytics

  • State: closed
  • Created: a year ago
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

1 reaction
patrickvonplaten commented, Oct 7, 2022

Hey @piEsposito,

I’m wondering whether we could maybe try to just write a community pipeline for this: https://github.com/huggingface/diffusers/tree/main/examples/community

1 reaction
piEsposito commented, Oct 5, 2022

I’ve created a feature request on accelerate to enable solving this in a more elegant way. If they let me work on the feature, I can open a PR and then try solving this.

Read more comments on GitHub >

Top Results From Across the Web

Memory and speed - Hugging Face
We present some techniques and ideas to optimize Diffusers inference for memory or speed. As a general rule, we recommend the use of...

Running Stable Diffusion on Your GPU with Less Than 10Gb ...
I was looking at GPU graphs and neglected my physical RAM. This machine only has 32G and I didn't notice I was hitting...

How to Fine-tune Stable Diffusion using Dreambooth
This tutorial focuses on how to fine-tune Stable Diffusion using another method ... optimized the code to reduce VRAM usage to under 16GB....

Command Line stable diffusion runs out of GPU memory but ...
My use case is i want it to execute to completion even if it takes much longer on my CPU as my machine...

Stable Diffusion Tutorial Part 1: Run Dreambooth in Gradient ...
Note: You will need at least 16 GB of GPU RAM to run this model training. The P5000, P6000, V100, V100-32G, RTX5000, A4000,...
