Failed to initialize NumPy / Numpy is not available
Greetings.
First of all, very stellar work on this. This looks to be very helpful to use in a personal project I'm working on.
With that, I'd like to get this set up on my local computer. I'm running a GTX 1060 6GB and I haven't had many issues running CUDA-enabled software in the past. I'm using an Anaconda environment with Python 3.10.4 on Windows 10 to run this code.
At first, I had a few troubles getting torch to work with my GPU, but it was an easy fix that involved appending +cu113 to two of the torch packages.
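For reference, that fix looked roughly like the following (a sketch only: the version numbers shown are illustrative and should be matched to your environment; the extra index URL is PyTorch's official CUDA 11.3 wheel index):

```shell
# Replace CPU-only wheels with CUDA 11.3 builds of torch/torchaudio.
# Versions here are examples -- pick the ones matching your setup.
pip uninstall -y torch torchaudio
pip install torch==1.11.0+cu113 torchaudio==0.11.0+cu113 \
    --extra-index-url https://download.pytorch.org/whl/cu113
```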
However, now I've run into another problem that I can't quite wrap my head around. When executing `do_tts.py`, I run into this:
(TortTTS2) PS D:\git\tortoise-tts> python do_tts.py --text "I'm going to speak this" --voice daniel --preset fast
D:\Conda\envs\TortTTS2\lib\site-packages\torch\_masked\__init__.py:223: UserWarning: Failed to initialize NumPy: module compiled against API version 0xf but this version of numpy is 0xe (Triggered internally at ..\torch\csrc\utils\tensor_numpy.cpp:68.)
example_input = torch.tensor([[-3, -2, -1], [0, 1, 2]])
Removing weight norm...
Generating autoregressive samples..
100%|████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:08<00:00,  1.35s/it]
Computing best candidates using CLVP and CVVP
0%| | 0/6 [00:00<?, ?it/s]D:\Conda\envs\TortTTS2\lib\site-packages\torch\utils\checkpoint.py:25: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
warnings.warn("None of the inputs have requires_grad=True. Gradients will be None")
100%|████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:08<00:00,  1.42s/it]
Transforming autoregressive outputs into audio..
D:\git\tortoise-tts\utils\stft.py:119: FutureWarning: Pass size=1024 as keyword args. From version 0.10 passing these as positional arguments will result in an error
fft_window = pad_center(fft_window, filter_length)
Traceback (most recent call last):
File "D:\git\tortoise-tts\do_tts.py", line 32, in <module>
gen = tts.tts_with_preset(args.text, conds, preset=args.preset, clvp_cvvp_slider=args.voice_diversity_intelligibility_slider)
File "D:\git\tortoise-tts\api.py", line 225, in tts_with_preset
return self.tts(text, voice_samples, **kwargs)
File "D:\git\tortoise-tts\api.py", line 365, in tts
mel = do_spectrogram_diffusion(self.diffusion, diffuser, latents, voice_samples, temperature=diffusion_temperature, verbose=verbose)
File "D:\git\tortoise-tts\api.py", line 136, in do_spectrogram_diffusion
cond_mel = wav_to_univnet_mel(sample.to(latents.device), do_normalization=False)
File "D:\git\tortoise-tts\utils\audio.py", line 138, in wav_to_univnet_mel
stft = TacotronSTFT(1024, 256, 1024, 100, 24000, 0, 12000)
File "D:\git\tortoise-tts\utils\audio.py", line 101, in __init__
self.stft_fn = STFT(filter_length, hop_length, win_length)
File "D:\git\tortoise-tts\utils\stft.py", line 120, in __init__
fft_window = torch.from_numpy(fft_window).float()
RuntimeError: Numpy is not available
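For what it's worth, both messages point at the same root cause: this torch wheel was compiled against NumPy's C-API version 0xf (NumPy 1.22.x), while the installed NumPy exposes the older 0xe (1.20.x/1.21.x). The mapping in the sketch below is based on NumPy's `numpyconfig.h` constants; the helper function itself is purely illustrative, not a real API:

```python
# Decode torch's "module compiled against API version 0xf but this version
# of numpy is 0xe" warning. The table reflects NumPy's C-API version
# constants (see numpyconfig.h); illustrative only.
C_API_TO_RELEASE = {
    "0xd": "1.16.x-1.19.x",
    "0xe": "1.20.x-1.21.x",
    "0xf": "1.22.x",
}

def explain_mismatch(compiled, installed):
    """Turn the two hex API versions from the warning into a readable note."""
    return (f"torch was built against NumPy {C_API_TO_RELEASE[compiled]}, "
            f"but NumPy {C_API_TO_RELEASE[installed]} is installed")

print(explain_mismatch("0xf", "0xe"))
# -> torch was built against NumPy 1.22.x, but NumPy 1.20.x-1.21.x is installed
```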
For the top error, I've found that it can apparently be fixed by upgrading NumPy. Unfortunately, I can't do so, since Numba requires a NumPy version greater than or equal to 1.18 and lower than 1.22.
I get this when upgrading NumPy:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
numba 0.55.0 requires numpy<1.22,>=1.18, but you have numpy 1.22.3 which is incompatible.
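One possible way out of this pin, assuming a newer Numba works for the rest of the stack, is to upgrade both packages together; to the best of my knowledge Numba 0.55.2 raised its NumPy ceiling to admit 1.22.x, but verify against Numba's release notes before relying on this:

```shell
# Upgrade numba and numpy together so their version ranges overlap.
# (numba 0.55.2 is believed to accept numpy 1.22 -- check the release
# notes for your platform.)
pip install --upgrade "numba>=0.55.2" "numpy>=1.22,<1.23"
```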
I'll be honest, going into this I wasn't even sure what Python version I should be using. I suspect it might have something to do with my torch installation, but I'm not sure. Any help with fixing this will be much appreciated!
Issue Analytics
- Created: a year ago
- Comments: 6 (2 by maintainers)
Top GitHub Comments
Thanks for the continued help @honestabelink. I'm pretty surprised this fits under 4GB, given that the diffusion model is a memory hog. One other suggestion to keep memory utilization at bay is to use `read.py` and adjust `desired_length` and `max_len` downwards: https://github.com/neonbjb/tortoise-tts/blob/main/tortoise/read.py#L11
Tortoise should work perfectly fine with these values halved, with perhaps a slight degradation in quality.
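For context on why lowering those two values helps: `read.py` chunks the input text before synthesis, so smaller chunks mean shorter autoregressive sequences held in GPU memory at once. A simplified stand-in for that splitting logic (not the actual Tortoise implementation; function and parameter names here are illustrative) might look like:

```python
import re

def split_text(text, desired_length=100, max_len=200):
    """Greedily pack sentences into chunks of roughly desired_length
    characters, never growing a chunk past max_len. A single sentence
    longer than max_len is kept whole (a real splitter would subdivide it).
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for s in sentences:
        candidate = f"{current} {s}".strip()
        if current and (len(current) >= desired_length or len(candidate) > max_len):
            # Current chunk is big enough (or the next sentence won't fit):
            # flush it and start a new chunk.
            chunks.append(current)
            current = s
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

print(split_text("One. Two. Three. Four.", desired_length=8, max_len=12))
# -> ['One. Two.', 'Three. Four.']
```

Halving `desired_length`/`max_len` roughly halves the longest sequence each model pass has to hold, which is why it trades a little quality for a lot of memory headroom.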
I need to write up a wiki on troubleshooting that encompasses all these tips. Would welcome help. Will leave this open till that's done.
@Path-A I have also run into these Numba issues in the past on Windows. It's a shame because it's a transitive dependency forced by librosa, and the only reason I use librosa is because the univnet vocoder that I'm using was trained using librosa's MEL function. The rest of Tortoise uses a MEL function exported by torchaudio, which is close to librosa's but not the same (despite the docs…). I may someday train my own vocoder to get rid of this dep.
Another point of failure can be generating the conditioning latents: https://github.com/neonbjb/tortoise-tts/blob/52e912f5d1c013da6c99362bd50cb404ac80024f/tortoise/models/autoregressive.py#L390-L398
This is because every sample is accumulated on the GPU; if you have many samples, or longer samples, you may hit OOM.
Something like moving the per-sample accumulation off the GPU allows for a ton of samples, though you shouldn't really need that many.
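A minimal sketch of that idea (the function names and the toy `encode` step are illustrative, not Tortoise's actual code): do the heavy per-sample work on the device, but park each result on the CPU immediately, so peak GPU memory is bounded by a single sample rather than the whole set:

```python
import torch

def make_conditioning(samples, encode, device="cpu"):
    """Encode each voice sample one at a time, accumulating results on the
    CPU instead of the accelerator. Stack once at the end; move the stacked
    tensor back to the device only when it is actually needed."""
    conds = []
    for s in samples:
        with torch.no_grad():
            c = encode(s.to(device))  # heavy per-sample work on `device`
        conds.append(c.cpu())         # park the result on the CPU right away
    return torch.stack(conds)

# Toy usage: "encode" here just averages over time.
samples = [torch.randn(1, 1024) for _ in range(8)]
latents = make_conditioning(samples, encode=lambda x: x.mean(dim=1))
print(latents.shape)  # torch.Size([8, 1])
```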
@neonbjb Adjusting the batch size and deleting/clearing unused torch tensors along the way, I never saw the model go above 4GB.
@rkjoan If you still run into issues, feel free to post a log 🙂