
CLIP model support for pipelines

See original GitHub issue

I am not sure if this has been considered, or is already on the roadmap, but I’d love to be able to just throw a CLIP model ID at a pipe and have it download and use said model.

I see HF has the capability to do this, and I have seen the CLIP-guided diffusion example Colab, but it seems to be implemented only for text2img and not for all the pipelines.
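
For reference, a minimal sketch of what transformers already offers via its zero-shot-image-classification pipeline (the task name and call shape are real transformers API; the model ID, image URL, and labels below are just illustrative):

from transformers import pipeline

# transformers can already turn a CLIP model ID into a ready-to-use pipeline
clip_pipe = pipeline(
    "zero-shot-image-classification",
    model="openai/clip-vit-base-patch32",
)

result = clip_pipe(
    "http://images.cocodataset.org/val2017/000000039769.jpg",  # illustrative image URL
    candidate_labels=["a photo of a cat", "a photo of a dog"],
)
print(result)  # list of {"score": ..., "label": ...} dicts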

I tried searching PRs, issues, and the repo for anything related to this being planned, but I don’t see anything.


On a side note, it would be cool to see pipes merged, with the necessary mode selected via an enum or something like pipe.IMG2IMG as the first param for a pretrained pipe setup.
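
To make that concrete, here is a purely hypothetical sketch of the suggested call shape (PipelineMode and the mode parameter do not exist in diffusers; they only illustrate the idea):

from enum import Enum

class PipelineMode(Enum):
    TEXT2IMG = "text2img"
    IMG2IMG = "img2img"
    INPAINT = "inpaint"

# Proposed call shape -- `mode` is NOT a real diffusers argument:
# pipe = StableDiffusionPipeline.from_pretrained(
#     "CompVis/stable-diffusion-v1-4", mode=PipelineMode.IMG2IMG
# )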

PS: loving these latest commits! Wow! Getting faster and more flexible.

Issue Analytics

  • State: closed
  • Created: a year ago
  • Comments: 9 (4 by maintainers)

Top GitHub Comments

1 reaction
dblunk88 commented, Oct 17, 2022

I’ve been thinking for a long time that pipes [of a type] should be merged.

Stable Diffusion, for example, should have all its basic pipelines merged into one Stable Diffusion pipeline:

StableDiffusionPipeline (similar to how it is now for text2img), except that one can pass an ID or enum to define the type of pipe it is.

This means the new StableDiffusionPipeline would be able to dynamically accept optional init images or masks based on the mode it’s in.

Now StableDiffusionPipeline is like a metropolitan office building. It has levels (modes) you can quickly access. It’s no longer like a rural office park complex where you must take a golf cart and drive on over to the next building for that department. 😃

It’s an efficient structure for high input/output departments that need to work together (such as text2img -> img2img/inpaint workflows).

You wouldn’t need to make separate pipes or overwrite pipes, but could dynamically change the mode it’s fed, with one pipe accepting and doing the work of all three methods.
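
A toy sketch of the dispatch this would imply (nothing here is real diffusers code; the class name, mode strings, and checks are assumptions for illustration):

class MergedStableDiffusionPipeline:
    """Toy stand-in for the proposed merged pipeline -- not real diffusers code."""

    VALID_MODES = ("text2img", "img2img", "inpaint")

    def __call__(self, prompt, mode="text2img", init_image=None, mask_image=None):
        # Validate the optional inputs against the requested mode...
        if mode not in self.VALID_MODES:
            raise ValueError(f"unknown mode {mode!r}, expected one of {self.VALID_MODES}")
        if mode in ("img2img", "inpaint") and init_image is None:
            raise ValueError(f"{mode} requires an init_image")
        if mode == "inpaint" and mask_image is None:
            raise ValueError("inpaint requires a mask_image")
        # ...then run the shared denoising loop with mode-specific preparation.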

I feel the benefits here are:

  • File-structure bloat is reduced when pipes of a type are merged
  • Community extensions of said pipes will cover all current diffusers pipe offerings without reimplementing (and maintaining) the same code across multiple pipes
  • Access on the user’s API side is improved, reducing script bloat in pipe setups
  • Code can be more efficient and effective using a dynamic, ID/enum-driven approach to which pipe is used
  • Happy HF Community ☺️

Cons:

  • Users would no doubt need to adjust their current implementations (but in diffusers’ infancy, this has happened frequently already as it matures)
  • ???

I love that idea

1 reaction
patil-suraj commented, Oct 7, 2022

Hi, thanks for the issue!

It may not be possible to support CLIP guidance in all pipelines, as we want to keep the pipelines simple enough that any user can modify them according to their needs. Pipelines are meant to be examples of how a certain task can be done, so they may not support all the functionality. We encourage users to take the pipelines and modify them according to their needs.

Also, we just released community pipelines, which allow any community pipeline to be loaded from diffusers; you can see how to load the CLIP-guided pipeline easily in the docs:

from diffusers import DiffusionPipeline
from transformers import CLIPFeatureExtractor, CLIPModel

clip_model_id = "laion/CLIP-ViT-B-32-laion2B-s34B-b79K"

# Load the CLIP model and feature extractor that will guide the diffusion process
feature_extractor = CLIPFeatureExtractor.from_pretrained(clip_model_id)
clip_model = CLIPModel.from_pretrained(clip_model_id)

# `custom_pipeline` pulls the CLIP-guided pipeline from the community examples
pipeline = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    custom_pipeline="clip_guided_stable_diffusion",
    clip_model=clip_model,
    feature_extractor=feature_extractor,
)
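
For completeness, running the loaded pipeline might then look like the sketch below. The clip_guidance_scale argument follows the community pipeline’s own example; the prompt and values here are illustrative, and exact arguments are best checked against the community pipeline’s docs.

import torch

pipeline = pipeline.to("cuda")
generator = torch.Generator(device="cuda").manual_seed(0)

image = pipeline(
    "a watercolor painting of a lighthouse at dawn",  # illustrative prompt
    num_inference_steps=50,
    guidance_scale=7.5,
    clip_guidance_scale=100,  # strength of the CLIP guidance (per the community example)
    generator=generator,
).images[0]
image.save("clip_guided.png")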

With this, anyone can share their pipelines with the community very easily. Can’t wait to see a community contribution for the merged pipelines; we can add it to the community pipelines here: https://github.com/huggingface/diffusers/tree/main/examples/community. Feel free to take a shot at it if you want, happy to help 😃

Read more comments on GitHub >
