
44.1K Sample Rate Strategies

See original GitHub issue

Here is a 44.1k sample rate clip, trained ~50k steps on VCTK speaker 280 (with a 100k sample size).

Any suggestions for how to improve it? I’m noticing:

Those pops and that top-end distortion. They sound sort of like zero-crossing pops to me?
The clip also sounds like it repeats over a short window, more so than other clips I’m hearing. I think this is a sample size issue. I ran into problems, probably memory-related, when I tried to scale my sample size along with the sample rate (most people here are using a 16k sample rate with a 100k sample size). I will look into this tomorrow and also test the checkpoints I already have with a wav_seed.
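For a rough sense of the sample size question, here is a minimal sketch of the arithmetic, assuming sample size simply means the number of audio samples per training piece. The helper names are hypothetical and are not part of tensorflow-wavenet.

```python
def clip_seconds(sample_size: int, sample_rate: int) -> float:
    """Duration in seconds covered by `sample_size` samples."""
    return sample_size / sample_rate


def matching_sample_size(sample_size: int, from_rate: int, to_rate: int) -> int:
    """Sample count at `to_rate` that covers the same duration as
    `sample_size` samples at `from_rate`."""
    return round(sample_size * to_rate / from_rate)


if __name__ == "__main__":
    # 100k samples at the common 16k setting vs. at 44.1k.
    print(clip_seconds(100_000, 16_000))   # 6.25 seconds
    print(clip_seconds(100_000, 44_100))   # roughly 2.27 seconds
    # Samples needed at 44.1k to cover the same 6.25 seconds.
    print(matching_sample_size(100_000, 16_000, 44_100))  # 275625
```

Under that assumption, covering the same duration at 44.1k takes about 2.76 times as many samples per piece, which would be consistent with running into memory limits.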

Settings I used were the ones @jyegerlehner posted here: https://github.com/ibab/tensorflow-wavenet/issues/47#issuecomment-250218850

{
    "filter_width": 2,
    "sample_rate": 44100,
    "dilations": [1, 2, 4, 8, 16, 32, 64, 128, 256, 512,
                  1, 2, 4, 8, 16, 32, 64, 128, 256, 512,
                  1, 2, 4, 8, 16, 32, 64, 128, 256, 512,
                  1, 2, 4, 8, 16, 32, 64, 128, 256, 512,
                  1, 2, 4, 8, 16, 32, 64, 128, 256, 512,
                  1, 2, 4, 8, 16, 32, 64, 128, 256, 512,
                  1, 2, 4, 8, 16, 32, 64, 128, 256, 512,
                  1, 2, 4, 8, 16, 32, 64, 128, 256, 512],
    "residual_channels": 32,
    "dilation_channels": 32,
    "quantization_channels": 256,
    "skip_channels": 1024,
    "use_biases": true
}
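To see how much audio these settings let the network look back over, here is a small sketch that estimates the receptive field. It assumes the textbook formula for a stack of dilated causal convolutions, (filter_width - 1) * sum(dilations) + 1. The repository's own calculation may differ by a sample or two (for example, if it counts an initial causal layer), so treat the numbers as approximate.

```python
# Values taken from the settings above.
FILTER_WIDTH = 2
DILATIONS = [2 ** i for i in range(10)] * 8  # 1..512, repeated 8 times


def receptive_field(filter_width: int, dilations: list[int]) -> int:
    """Approximate receptive field, in samples, of stacked dilated
    causal convolutions (textbook formula, assumed here)."""
    return (filter_width - 1) * sum(dilations) + 1


def receptive_field_ms(samples: int, sample_rate: int) -> float:
    """Convert a receptive field in samples to milliseconds."""
    return 1000.0 * samples / sample_rate


if __name__ == "__main__":
    rf = receptive_field(FILTER_WIDTH, DILATIONS)
    print(rf)                              # 8185 samples
    print(receptive_field_ms(rf, 44_100))  # roughly 185.6 ms
    print(receptive_field_ms(rf, 16_000))  # roughly 511.6 ms
```

Under this estimate the same dilation stack covers about 186 ms of audio at 44.1k, compared with about 512 ms at 16k, so the model hears a much shorter context per prediction at the higher sample rate.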

Issue Analytics

Created 7 years ago
Comments: 19 (9 by maintainers)

Top GitHub Comments

4 reactions

chrisnovello commented, Oct 17, 2016

@zeta36 The clip reproduces some of the character of my recorded dataset, for sure. Glitchier and otherworldly (I suspect a larger receptive field would help).

“Did you record yourself saying the same phrases than in one VCTK speaker?” — yeah.

I re-recorded a vocal set from VCTK (toward future experiments of training on my voice + the full dataset). I added some compression and a little bit of de-essing. It was recorded on a $100 condenser mic in a living room (so yes, there is some room sound in my dataset, more than in the VCTK recordings but not that much).

I found the VCTK txts awkward to read (I would never speak in the style they’re written) and thus the personality of my dataset is pretty announcerly. I’m planning to sample my own writing and do more passes in the near future.

2 reactions

Zeta36 commented, Oct 17, 2016

@paperkettle, some people over in #112 are beginning to develop local conditioning. Your voice could be one of the first to use and test this feature. I recommend forking @alexbeloi’s development work on this and giving it a try.

Regards.
