Unconditional synthesis

I'm running this command to generate unconditional samples.

python -m diffwave.inference --fast /path/to/model -o output.wav

I’ve trained for almost 4k epochs on 7k+ sounds. I seem to get the same sound (or a very similar one) regardless of training time.

I have not worked with diffwave before - any tips for debugging this?

Thanks

Issue Analytics

Created a year ago
Comments: 5 (1 by maintainers)

Top GitHub Comments

2 reactions

Rongjiehuang commented, Jul 27, 2022

It seems that the DiffWave paper uses res_channel = 256 for unconditional speech synthesis, whereas this code uses 64, which may be why we cannot get reasonable sounds.
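To get a rough feel for how much capacity that change adds, here is a back-of-the-envelope sketch in Python. It assumes each residual layer is dominated by a kernel-size-3 dilated convolution mapping C to 2C channels plus a 1x1 output convolution mapping C to 2C channels. The function name and the layer count of 30 are illustrative assumptions, and bias, embedding, and conditioning weights are ignored.

```python
def approx_residual_params(channels, layers=30, kernel_size=3):
    """Rough weight count for a stack of DiffWave-style residual layers.

    Assumption: each layer is dominated by two convolutions,
    a dilated conv (C -> 2C, kernel 3) and a 1x1 output conv (C -> 2C).
    Biases, diffusion embeddings and conditioner weights are ignored.
    """
    dilated_conv = kernel_size * channels * (2 * channels)
    output_conv = channels * (2 * channels)
    return layers * (dilated_conv + output_conv)


small = approx_residual_params(64)
large = approx_residual_params(256)
print(f"64 channels:  ~{small:,} weights")
print(f"256 channels: ~{large:,} weights")
print(f"ratio: {large / small:.0f}x")
```

Because both convolutions scale with the square of the channel count, quadrupling the channels multiplies these weights by 16 under the stated assumptions, so expect noticeably higher memory use and longer training times.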

0 reactions

Andrechang commented, Jul 27, 2022

It shouldn’t output silent waveforms. When I trained only briefly, it generated noisy audio.
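To tell these cases apart (silence, near-identical output, or genuinely different samples), a quick numeric check on two generated clips can help. The sketch below is not part of diffwave: the helper names and thresholds are hypothetical, and it assumes the two clips are already loaded as lists of floats in [-1, 1].

```python
import math


def rms(samples):
    """Root-mean-square level of a waveform given as a list of floats."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def correlation(a, b):
    """Pearson correlation of two waveforms, truncated to the shorter one.

    Returns 0.0 if either waveform is constant (zero variance).
    """
    n = min(len(a), len(b))
    a, b = a[:n], b[:n]
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def diagnose(a, b, silence_rms=1e-3, same_corr=0.95):
    """Classify a pair of clips; both thresholds are arbitrary starting points."""
    if rms(a) < silence_rms and rms(b) < silence_rms:
        return "silent"
    if abs(correlation(a, b)) > same_corr:
        return "near-identical"
    return "distinct"
```

Sample-aligned correlation is a crude measure: the same sound shifted by a few milliseconds will score as "distinct", so treat a "near-identical" result as strong evidence and a "distinct" result as inconclusive. Generating the two clips from different random seeds makes the comparison meaningful.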
