Masking not working in training, thanks
Hi, I have tried to train the model on a GPU with masking enabled. Line 94,
t = torch.flip(t, dims = (2,))
reports an error: RuntimeError: "flip_cuda" not implemented for 'Bool', even though I have tried moving the mask to the CPU.
Any ideas to solve the problem? Thanks a lot.
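For context, this error comes from torch.flip having no CUDA kernel for bool tensors in some older PyTorch builds. A minimal workaround sketch, assuming the mask is a boolean tensor on the GPU (this is an illustration, not the fix the maintainer applied):

```python
import torch

# Hypothetical workaround sketch (assumption, not the repo's actual fix):
# some PyTorch builds have no CUDA "flip" kernel for bool tensors, so cast
# the boolean mask to uint8, flip it, then cast back to bool.
device = 'cuda' if torch.cuda.is_available() else 'cpu'
mask = torch.ones(2, 8, 16, dtype=torch.bool, device=device)
flipped = torch.flip(mask.to(torch.uint8), dims=(2,)).to(torch.bool)
```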
Issue Analytics
- Created: 2 years ago
- Reactions: 1
- Comments: 6 (3 by maintainers)
Top GitHub Comments
Thanks a lot for the info. I will check that, and if it doesn't work I will open an issue again. Let's close this issue, and thanks again for your help.
Dear @lucidrains, I can confirm that the new change fixed the masking problem; my training now runs without issues. Thanks a lot for your efforts.
However, there is another issue when I train the model on multiple GPUs:
\site-packages\rotary_embedding_torch\rotary_embedding_torch.py", line 45, in apply_rotary_emb
    t = (t * freqs.cos()) + (rotate_half(t) * freqs.sin())
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0!
I am not sure whether this is a known issue, but I can at least train on a single GPU, so we can close this issue. Thank you very much.
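For anyone hitting the same device mismatch under nn.DataParallel, one common pattern is to make sure the frequency tensor follows the input's device inside apply_rotary_emb. A minimal sketch under that assumption (rotate_half here is a simplified stand-in, and this is not necessarily how rotary-embedding-torch handles it):

```python
import torch

def rotate_half(x):
    # simplified stand-in for the library's rotate_half helper:
    # split the last dimension in half, swap the halves, negate one.
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)

def apply_rotary_emb(freqs, t):
    # hypothetical guard (assumption): under nn.DataParallel the inputs are
    # scattered across GPUs while the cached freqs may stay on cuda:0, so
    # move freqs onto the input's device before the element-wise ops.
    freqs = freqs.to(t.device)
    return (t * freqs.cos()) + (rotate_half(t) * freqs.sin())
```

Alternatively, registering the frequencies as a module buffer usually lets nn.DataParallel replicate them onto each device automatically.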