Error with shapes in LocalAttention

Hi, I recently updated to the latest version of the package (0.22.3) and hit a shape bug in LocalAttention’s forward() method (reformer_pytorch.py#L502). It occurs when both num_mem_kv and n_local_attn_heads are non-zero:

    501         out = torch.einsum('bhij,bhje->bhie', attn, bv)
--> 502         out = out.reshape(b, t, e)
    503         return out
    504 

RuntimeError: shape '[2, 2048, 64]' is invalid for input of size 278528

When either num_mem_kv or n_local_attn_heads is zero, the model works fine.
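The numbers in the error message can be checked with some quick arithmetic. The sketch below uses only values from the traceback and the reproduction config. The variable names are illustrative, and reading b as batch size times local heads is an assumption. The surplus works out to exactly num_mem_kv extra positions per (b, e) slice, which suggests (but does not prove) that the memory key/values end up in the sequence dimension of the output.

```python
# Values copied from the RuntimeError and the ReformerLM config in this issue.
b, t, e = 2, 2048, 64      # target shape in the error: [2, 2048, 64]
actual_numel = 278528      # "input of size" reported by the error
num_mem_kv = 128           # from the reproduction config

# Number of elements the reshape target can hold.
expected_numel = b * t * e

# Surplus elements, and how many extra sequence positions they amount to.
excess = actual_numel - expected_numel
extra_positions = excess // (b * e)

print(expected_numel)    # 262144
print(excess)            # 16384
print(extra_positions)   # 128, equal to num_mem_kv
```

This only accounts for the element counts; where the extra positions are introduced inside forward() is not something the issue text confirms.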

Minimal example for reproducing the error:

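import torch
from reformer_pytorch import ReformerLM

num_tokens, seq_len = 1024, 2048

# Both num_mem_kv and n_local_attn_heads are non-zero, which triggers the error.
model = ReformerLM(
    num_tokens=num_tokens,
    dim=512,
    depth=6,
    max_seq_len=seq_len,
    num_mem_kv=128,
    n_local_attn_heads=2,
).cuda()

# A single random token sequence of full length; requires a CUDA device.
x = torch.randint(0, num_tokens, (1, seq_len)).long().cuda()
y = model(x)  # raises the RuntimeError shown above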

Hope that helps! Thanks for your work on the package!

Top GitHub Comments

ilya16 commented, Apr 24, 2020

@lucidrains I haven't experimented much with local attention so far; I just got back to Reformer and updated to the latest version. If I notice any strange or positive behaviour, I will report it.

lucidrains commented, Apr 24, 2020

@ilya16 please do!