
Question on enc_input_mask and ignore_index / pad_value for EncDec.


Sorry if this has already been answered. I am a little confused about how to tell the Enc-Dec model to ignore the padding value, both for training and for generating. Is ignore_index/pad_value sufficient, or does an additional enc_input_mask also need to be fed in, with the padding positions marked False?

My implementation is similar to the following code:

import torch
from reformer_pytorch import ReformerEncDec

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

pad_tok = ... # the token id used for padding
INPUT_DIM = 20000
OUTPUT_DIM = 20000
max_length = 256

model = ReformerEncDec(
    enc_num_tokens = INPUT_DIM,
    enc_max_seq_len = max_length,
    dec_num_tokens = OUTPUT_DIM,
    dec_max_seq_len = max_length,
    ignore_index = pad_tok,
    pad_value = pad_tok
).to(device)

Feeding forward and training the model:

optimizer = ...
# src =  (32, max_length)
# trg = (32, max_length)
...
# training
optimizer.zero_grad()
loss = model(src, trg, return_loss=True)
loss.backward()
optimizer.step()
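
To make the question concrete, here is a minimal pure-Python sketch of what an ignore_index setting usually means for a cross-entropy loss: positions whose target equals the padding id are skipped, and the mean is taken over the remaining positions. This mirrors the documented behaviour of PyTorch's cross-entropy ignore_index; that ReformerEncDec forwards its ignore_index argument in the same way is an assumption, not something this thread confirms. The function name masked_cross_entropy and the value pad_tok = 0 are hypothetical.

```python
import math

def masked_cross_entropy(logits, targets, ignore_index):
    """Mean cross-entropy over positions whose target is not ignore_index."""
    total, count = 0.0, 0
    for row, target in zip(logits, targets):
        if target == ignore_index:
            continue  # padded position: contributes nothing to the loss
        peak = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_z - row[target]
        count += 1
    # Returning 0.0 for an all-padding input is a choice made for this sketch.
    return total / count if count else 0.0

pad_tok = 0  # hypothetical padding id, for illustration only
targets = [1, 2, pad_tok]  # the last position is padding

# The two inputs differ only at the padded position.
logits_a = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [9.0, -3.0, 1.0]]
logits_b = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [-5.0, 4.0, 2.0]]

loss_a = masked_cross_entropy(logits_a, targets, ignore_index=pad_tok)
loss_b = masked_cross_entropy(logits_b, targets, ignore_index=pad_tok)
```

Both losses come out identical, because whatever the model predicts at a padded target position never reaches the loss. Note that this only concerns the decoder targets; it says nothing about whether the encoder attends to padded input positions, which is what a mask would control.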

To generate for a batch, I simply call the following:

seq_out = torch.zeros((src.shape[0], 1)).long().to(device) 
sample = model.generate(src, seq_out, 
    seq_len = max_length, 
    input_mask=None)
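
For reference, a padding mask of the kind the question describes (True for real tokens, False at padding positions) could be built as sketched below. This is pure Python for illustration; the helper name padding_mask, the value pad_tok = 0, and the sample batch are hypothetical. Which keyword argument the model expects the mask under is not confirmed by this thread.

```python
def padding_mask(batch, pad_tok):
    """Return a mask that is True for real tokens and False for padding."""
    return [[tok != pad_tok for tok in row] for row in batch]

pad_tok = 0  # hypothetical padding id, for illustration only
src_example = [
    [5, 8, 2, 0, 0],  # three real tokens followed by two pads
    [7, 3, 9, 4, 1],  # no padding
]

mask = padding_mask(src_example, pad_tok)
# With a torch.LongTensor `src`, the equivalent boolean mask is `src != pad_tok`.
```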

Also, is it normal for generation to take roughly half a minute for a batch size of 32, and 1 minute for a batch size of 64? I have been testing on a Kaggle Kernel (Tesla P100, 16 GB VRAM) with the following parameters and the same generate code block as above.

model = ReformerEncDec(
    dim = 64,
    enc_num_tokens = 20000,
    enc_depth = 2,
    enc_max_seq_len = 256,
    enc_heads = 4,
    dec_num_tokens = 20000,
    dec_depth = 2,
    dec_max_seq_len = 256,
    dec_heads = 4,
    ignore_index = pad_tok,
    pad_value = pad_tok
)
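
As a rough back-of-the-envelope check on those timings, the arithmetic below assumes that generation is autoregressive and runs one decoder step per output token, up to seq_len = 256 steps. The 30 s and 60 s figures are the approximate ones quoted above, not measurements.

```python
seq_len = 256  # decoding steps, assuming one step per generated token

# (batch size, approximate wall-clock seconds) as reported in the question
timings = [(32, 30.0), (64, 60.0)]

per_step = {}           # seconds per decoding step for the whole batch
per_step_per_seq = {}   # seconds per decoding step, per sequence
for batch_size, seconds in timings:
    per_step[batch_size] = seconds / seq_len
    per_step_per_seq[batch_size] = seconds / seq_len / batch_size

for batch_size in per_step:
    print(batch_size, per_step[batch_size], per_step_per_seq[batch_size])
```

Under these assumptions, the cost per step per sequence is the same for both batch sizes, meaning the reported time grows linearly with batch size instead of being absorbed by GPU parallelism.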

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

lucidrains commented, Apr 3, 2020 (1 reaction)

@nakarinh14 yes, it should as well

lucidrains commented, Apr 3, 2020 (1 reaction)

@nakarinh14 yes, that should work!
