
CLIP model support for pipelines

See original GitHub issue

I am not sure if this has been considered, or is already on the roadmap, but I’d love to be able to just throw a CLIP model ID at a pipe and have it download and use said model.

I see HF has the capability to do this, and I have seen the CLIP-guided diffusion example Colab, but it seems to be implemented only for text2img and not for all the pipelines.
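
For reference, a minimal sketch of what transformers already offers via its zero-shot-image-classification pipeline (the task name and call shape are real transformers API; the model ID, image URL, and labels below are just illustrative):

from transformers import pipeline

# transformers can already turn a CLIP model ID into a ready-to-use pipeline
clip_pipe = pipeline(
    "zero-shot-image-classification",
    model="openai/clip-vit-base-patch32",
)

result = clip_pipe(
    "http://images.cocodataset.org/val2017/000000039769.jpg",  # illustrative image URL
    candidate_labels=["a photo of a cat", "a photo of a dog"],
)
print(result)  # list of {"score": ..., "label": ...} dicts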

I tried searching PRs, issues, and the repo for anything related to this being planned, but I don’t see anything.


On a side note, it would be cool to see pipes merged, with the necessary mode selected via an enum or something like pipe.IMG2IMG as the first param for a pretrained pipe setup.
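
To make that concrete, here is a purely hypothetical sketch of the suggested call shape (PipelineMode and the mode parameter do not exist in diffusers; they only illustrate the idea):

from enum import Enum

class PipelineMode(Enum):
    TEXT2IMG = "text2img"
    IMG2IMG = "img2img"
    INPAINT = "inpaint"

# Proposed call shape -- `mode` is NOT a real diffusers argument:
# pipe = StableDiffusionPipeline.from_pretrained(
#     "CompVis/stable-diffusion-v1-4", mode=PipelineMode.IMG2IMG
# )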

PS: loving these latest commits! Wow! Getting faster and more flexible.

Issue Analytics

  • State: closed
  • Created: a year ago
  • Comments: 9 (4 by maintainers)

Top GitHub Comments

1 reaction
dblunk88 commented, Oct 17, 2022

I’ve been thinking for a long time that pipes [of a type] should be merged.

Stable Diffusion, for example, should have all its basic pipelines merged into one Stable Diffusion pipeline:

StableDiffusionPipeline (similar to how it is now for text2img), except that one can pass an ID or enum to define the type of pipe it is.

This means the new StableDiffusionPipeline would be able to dynamically accept optional init images or masks based on the mode it’s in.

Now StableDiffusionPipeline is like a metropolitan office building. It has levels (modes) you can quickly access. It’s no longer like a rural office park complex where you must take a golf cart and drive on over to the next building for that department. 😃

It’s an efficient structure for high input/output departments that need to work together (such as text2img -> img2img/inpaint workflows).

You wouldn’t need to make separate pipes or overwrite pipes, but could dynamically change the mode it’s fed, with one pipe accepting and doing the work of all three methods.
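
A toy sketch of the dispatch this would imply (nothing here is real diffusers code; the class name, mode strings, and checks are assumptions for illustration):

class MergedStableDiffusionPipeline:
    """Toy stand-in for the proposed merged pipeline -- not real diffusers code."""

    VALID_MODES = ("text2img", "img2img", "inpaint")

    def __call__(self, prompt, mode="text2img", init_image=None, mask_image=None):
        # Validate the optional inputs against the requested mode...
        if mode not in self.VALID_MODES:
            raise ValueError(f"unknown mode {mode!r}, expected one of {self.VALID_MODES}")
        if mode in ("img2img", "inpaint") and init_image is None:
            raise ValueError(f"{mode} requires an init_image")
        if mode == "inpaint" and mask_image is None:
            raise ValueError("inpaint requires a mask_image")
        # ...then run the shared denoising loop with mode-specific preparation.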

I feel the benefits here are:

  • File-structure bloat is reduced when pipes of a type are merged
  • Community extensions of said pipes will cover all current diffusers pipe offerings without reimplementing (and maintaining) the same code across multiple pipes
  • Access on the user’s API side is improved, reducing script bloat in pipe setups
  • Code can be more efficient and effective using a dynamic, ID/enum-driven approach to which pipe is used
  • Happy HF Community ☺️

Cons:

  • Users would no doubt need to adjust their current implementations (but in diffusers’ infancy, this has happened frequently already as it matures)
  • ???

I love that idea

1 reaction
patil-suraj commented, Oct 7, 2022

Hi, thanks for the issue!

It may not be possible to support CLIP guidance in all pipelines, as we want to keep the pipelines simple enough that any user can modify them according to their needs. Pipelines are meant to be examples of how a certain task can be done, so they may not support all the functionality. We encourage users to take the pipelines and modify them according to their needs.

Also, we just released community pipelines, which allow any community pipeline to be loaded from diffusers; you can see how to load the CLIP-guided pipeline easily in the docs:

from diffusers import DiffusionPipeline
from transformers import CLIPFeatureExtractor, CLIPModel

clip_model_id = "laion/CLIP-ViT-B-32-laion2B-s34B-b79K"

# Load the CLIP model and feature extractor that will guide the diffusion process
feature_extractor = CLIPFeatureExtractor.from_pretrained(clip_model_id)
clip_model = CLIPModel.from_pretrained(clip_model_id)

# `custom_pipeline` pulls the CLIP-guided pipeline from the community examples
pipeline = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    custom_pipeline="clip_guided_stable_diffusion",
    clip_model=clip_model,
    feature_extractor=feature_extractor,
)
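
For completeness, running the loaded pipeline might then look like the sketch below. The clip_guidance_scale argument follows the community pipeline’s own example; the prompt and values here are illustrative, and exact arguments are best checked against the community pipeline’s docs.

import torch

pipeline = pipeline.to("cuda")
generator = torch.Generator(device="cuda").manual_seed(0)

image = pipeline(
    "a watercolor painting of a lighthouse at dawn",  # illustrative prompt
    num_inference_steps=50,
    guidance_scale=7.5,
    clip_guidance_scale=100,  # strength of the CLIP guidance (per the community example)
    generator=generator,
).images[0]
image.save("clip_guided.png")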

With this, anyone can share their pipelines with the community very easily. Can’t wait to see a community contribution for the merged pipelines; we can add it to the community pipelines here: https://github.com/huggingface/diffusers/tree/main/examples/community. Feel free to take a shot at it if you want, happy to help 😃

Read more comments on GitHub >
