Speech to image pipeline, Unexpected output, green image

Describe the bug

The resulting image is greenish.

Reproduction

import torch

import matplotlib.pyplot as plt
from datasets import load_dataset
from diffusers import DiffusionPipeline
from transformers import (
    WhisperForConditionalGeneration,
    WhisperProcessor,
)


device = "cuda" if torch.cuda.is_available() else "cpu"

ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")

audio_sample = ds[3]

text = audio_sample["text"].lower()
speech_data = audio_sample["audio"]["array"]

model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to(device)
processor = WhisperProcessor.from_pretrained("openai/whisper-small")

diffuser_pipeline = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    custom_pipeline="speech_to_image_diffusion",
    speech_model=model,
    speech_processor=processor,
    revision="fp16",
    torch_dtype=torch.float16,
)

diffuser_pipeline.enable_attention_slicing()
diffuser_pipeline = diffuser_pipeline.to(device)

output = diffuser_pipeline(speech_data)
plt.imsave('aa.png',output.images[0])

The resulting image (aa.png) appears misaligned.

Logs

No response

System Info

  • diffusers version: 0.6.0
  • Platform: Linux-4.15.0-142-generic-x86_64-with-glibc2.23
  • Python version: 3.9.13
  • PyTorch version (GPU?): 1.8.1+cu101 (True)
  • Huggingface_hub version: 0.10.1
  • Transformers version: 4.23.1
  • Using GPU in script?: yes
  • Using distributed or parallel set-up in script?: No

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction
MikailINTech commented, Oct 26, 2022

@MikailINTech do you mind taking a look here? 😃

I’ll have a look at it, thanks for tagging me

0 reactions
darwinharianto commented, Oct 27, 2022

Sorry, I found it: I wasn't supposed to call plt.imsave('aa.png', output.images[0]) but output.images[0].save('bb.png').
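
For anyone landing here with the same symptom, the fix above can be sketched in isolation. This is a minimal sketch, assuming output.images[0] is a PIL.Image.Image (which the working .save call suggests). The solid-color image below is a hypothetical stand-in for the pipeline output, so the snippet runs without downloading any model, and the temporary output path is likewise only illustrative.

```python
import os
import tempfile

from PIL import Image

# Hypothetical stand-in for output.images[0]; assumed to be a PIL image.
image = Image.new("RGB", (64, 64), color=(200, 30, 30))

# Save with the image's own save() method rather than plt.imsave().
out_path = os.path.join(tempfile.gettempdir(), "bb.png")
image.save(out_path)

# Reload the file and read one pixel to confirm the colors survived.
with Image.open(out_path) as reloaded:
    reloaded_pixel = reloaded.convert("RGB").getpixel((0, 0))

print(reloaded_pixel)
```

In the original script, this amounts to replacing the plt.imsave(...) line with output.images[0].save('bb.png').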
