Method to prepare offline cache
`from_pretrained()` can work in offline mode by loading from the cache, but we lack a method to explicitly populate that cache. I’d like something along the lines of `.cache_pretrained_for_offline()` that fetches all the files necessary for `.from_pretrained()` but doesn’t actually load the weights.
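For illustration, usage might look like the sketch below. Only the method name comes from this proposal; the model id and the `HF_HUB_OFFLINE` step are just one way it could fit together, and nothing here exists in diffusers today:

```python
from diffusers import StableDiffusionPipeline

# Hypothetical method from this proposal -- it does not exist yet.
# It would fetch every file from_pretrained() needs into the shared
# huggingface_hub cache, without instantiating models or loading weights.
StableDiffusionPipeline.cache_pretrained_for_offline("runwayml/stable-diffusion-v1-5")

# Later (possibly with HF_HUB_OFFLINE=1 set, or with the network down),
# this resolves entirely from the local cache:
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
```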
Use cases would include:
- something you do in an installation step or a “prepare for offline use” action to avoid loading delays later in the application, or in anticipation of network access becoming unavailable.
- preparing an environment (archive, container, disk image, etc.) on a low-resource machine that will then be copied over to a high-spec machine for production use.
It should be able to run without a GPU (or other intended target device for the model) or heaps of RAM.
The advantage of populating the `huggingface_hub` cache with the model, instead of saving a copy of the model to an application-specific local path, is that you get to share that cache with other applications, you don’t need any extra code to apply updates to your copy, you don’t need a switch to change from the default on-demand loading location to your local copy, etc.
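For reference, the closest thing available today is `huggingface_hub`’s `snapshot_download`, which populates that same shared cache without loading anything; a minimal sketch (the model id is just an example):

```python
from huggingface_hub import snapshot_download

# Fetches every file in the repo into the shared huggingface_hub cache,
# without loading weights into RAM or touching a GPU, and returns the
# local cache path. Note it grabs *all* files in the repo (e.g. every
# precision variant), not just the subset from_pretrained() would resolve,
# which is part of why a dedicated method would still help.
local_path = snapshot_download("runwayml/stable-diffusion-v1-5")
print(local_path)
```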
Top GitHub Comments
Mostly this, yes. It’s a lot of RAM.
Also, when you try to load fp16 weights without a CUDA device, it spams verbose warnings (probably one for each sub-model with weights) that I haven’t found a clean way to suppress.
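For what it’s worth, one blunt workaround is to drop the library loggers to error level before loading. This assumes the warnings in question go through the `transformers`/`diffusers` logging utilities, which may not hold for every message:

```python
import transformers
from diffusers.utils import logging as diffusers_logging

# Silence sub-model loading chatter (fp16-on-CPU notices, unused-weight
# notices, etc.) by raising both libraries' log levels to ERROR.
transformers.logging.set_verbosity_error()
diffusers_logging.set_verbosity_error()
```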
That’d be great for other reasons we’ve discussed, but it’s only tangentially related here. We have a requirement to make a self-contained installation for offline mode regardless of license.
I see! This PR could help a bit: https://github.com/huggingface/diffusers/pull/1450
However, it still forces one to load the model into RAM, but doing something like the sketch below would be a simple fix for now. In the future we could factor out the whole downloading function. But since this function is still very prone to change, and I don’t see the use case of 0-RAM downloading of the models as very important at the moment, I’d prefer to have one long, readable `from_pretrained` function for now.
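For concreteness, here is one possible shape of that kind of fix, sketched with an invented keyword argument; nothing below is the actual API from the PR:

```python
from diffusers import StableDiffusionPipeline

# Hypothetical kwarg, invented for illustration: make from_pretrained()
# stop after the download/caching step and return the local cached folder
# path instead of an instantiated pipeline, so no weights are ever
# loaded into RAM or moved to a device.
cached_folder = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    return_cached_folder=True,  # invented for illustration
)
```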