[Community Pipeline] Modified Cross-Attention - Structured Diffusion Guidance for Compositional T2I synthesis

Intro

Community Pipelines were introduced in diffusers==0.4.0 to let the community quickly add, integrate, and share custom pipelines built on top of diffusers.

You can find a guide about Community Pipelines here. You can also find all the community examples under examples/community/. If you have questions about the Community Pipelines feature, please head to the parent issue.
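For readers new to the feature, here is a minimal sketch of how a community pipeline is typically loaded. It relies on the `custom_pipeline` argument of `DiffusionPipeline.from_pretrained`; the helper name `load_community_pipeline` is hypothetical, and the model ID and pipeline name in the usage note are only examples, not the pipeline proposed in this issue.

```python
def load_community_pipeline(model_id: str, custom_pipeline: str):
    """Load a community pipeline on top of a pretrained checkpoint.

    `custom_pipeline` is the name of a file under examples/community/
    (without the .py suffix). Both arguments are supplied by the caller;
    nothing here is specific to the pipeline proposed in this issue.
    """
    # Imported lazily so this module can be imported without diffusers installed.
    from diffusers import DiffusionPipeline

    return DiffusionPipeline.from_pretrained(
        model_id,
        custom_pipeline=custom_pipeline,
    )
```

A call might look like `load_community_pipeline("CompVis/stable-diffusion-v1-4", "lpw_stable_diffusion")`, which downloads the checkpoint and the community pipeline file on first use.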

Idea: Modified cross-attention mechanism

This pipeline aims to apply the paper Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis to Stable Diffusion, improving how the model interprets prompts. Some results from the paper: [image]
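To make the "modified cross-attention" idea concrete, here is a minimal pure-Python sketch. It assumes the modification keeps a single attention map computed from the full prompt and applies it to several value matrices (for example, one per noun phrase of the prompt), then averages the results. That reading is an assumption about the paper's method and is not stated in this issue; all function names are hypothetical, and this is neither the authors' reference implementation nor diffusers code.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply an (n x d) matrix by a (d x m) matrix, both as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def attention_map(q, k):
    """Scaled dot-product attention weights: softmax(q @ k^T / sqrt(d))."""
    scale = 1.0 / math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose k to (d x m)
    scores = matmul(q, k_t)
    return [softmax([s * scale for s in row]) for row in scores]


def structured_cross_attention(q, k, values):
    """Apply one shared attention map to each value matrix and average.

    q:      image queries, shape (n x d)
    k:      keys from the full prompt, shape (m x d)
    values: list of value matrices, each (m x d_v); assumed to come from
            the full prompt and from separately encoded sub-phrases
    """
    attn = attention_map(q, k)
    outputs = [matmul(attn, v) for v in values]
    count = len(values)
    rows, cols = len(outputs[0]), len(outputs[0][0])
    return [
        [sum(out[i][j] for out in outputs) / count for j in range(cols)]
        for i in range(rows)
    ]
```

With a single value matrix this reduces to ordinary cross-attention, so the sketch only changes behavior when extra value matrices are supplied.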

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 7 (5 by maintainers)

Top GitHub Comments

4 reactions
20RitikSingh commented, Oct 17, 2022

I want to work on this.

3 reactions
keturn commented, Oct 18, 2022

This is a super interesting paper! Note that there is a reference implementation included in the .zip of “supplementary material” on the paper submission. Also note that this implementation is not released under the Apache License, so I don’t know if 🤗 can accept a PR that includes it. It might be safer to do a clean-room implementation based only on the description in the text of the paper.

[I Am Not A Lawyer and I Am Not Your Lawyer, but the paper’s authors are obligated to remain anonymous until the end of the ICLR 2023 review period, so they might have a hard time speaking up for themselves right now.]

If you do choose to use the code from the supplementary material, you may need to swap some things around to make it better fit diffusers instead of ldm.


