Additional input features for decoder

I’m trying to implement a translation decoder (a subclass of FairseqIncrementalDecoder) that can take additional input features (e.g., language id, tags, etc.) in addition to the input tokens. I am doing this by including the feature as a keyword argument in forward, which looks like this:

def forward(self, prev_output_tokens, encoder_out, segments=None, incremental_state=None):

where segments is the extra input I mentioned.

I also created a custom dataset and translation task to add the segment info into the batch. It works fine with fairseq-train. However, I can’t use the model for inference with fairseq-generate, since SequenceGenerator calls the decoder directly with only the default inputs (and without **kwargs; see https://github.com/pytorch/fairseq/blob/master/fairseq/sequence_generator.py#L608). Could you recommend a good approach for doing this? Thanks.
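To make the mismatch concrete, here is a minimal sketch in plain Python. ToyDecoder is a hypothetical stand-in, not a fairseq class, and the two calls only imitate the behaviour described above: a training-style call that unpacks the batch inputs as keyword arguments, and a generation-style call that passes only the default inputs.

```python
class ToyDecoder:
    """Hypothetical stand-in for a FairseqIncrementalDecoder subclass."""

    def forward(self, prev_output_tokens, encoder_out, segments=None, incremental_state=None):
        # Report whether the extra feature actually reached the decoder.
        return {"tokens": prev_output_tokens, "got_segments": segments is not None}


decoder = ToyDecoder()

# Training-style call: the batch inputs are unpacked as keyword arguments,
# so an extra "segments" entry flows through to forward().
net_input = {"prev_output_tokens": [2, 5, 7], "segments": [0, 0, 1]}
train_out = decoder.forward(encoder_out=None, **net_input)

# Generation-style call: only the default inputs are passed, so "segments"
# silently falls back to None.
gen_out = decoder.forward([2, 5, 7], None, incremental_state={})

print(train_out["got_segments"])  # True
print(gen_out["got_segments"])    # False
```

Because segments defaults to None, the second call does not raise an error; the feature is simply missing, which is why the problem only shows up at inference time.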

Top GitHub Comments

raymondhs commented, Jun 27, 2019

Hello, what I did was subclass a suitable existing FairseqDataset and FairseqTask in fairseq, then modify the relevant methods accordingly (see https://github.com/raymondhs/fairseq-laser/blob/master/laser/laser_dataset.py#L19-L37). Here I modified the collater function to include a target language embedding as input to the decoder. (I think it’s a little hacky though…)
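As a rough illustration of that idea (not the code from the linked repository), a collater can attach the extra feature to the batch's net_input so it travels alongside the usual inputs. The sketch below uses plain Python lists to stay self-contained; in fairseq these would be padded tensors. The function names, the sample keys, and the padding ids are all assumptions made for the example.

```python
def pad(seqs, pad_idx):
    """Right-pad a list of integer lists to a common length."""
    width = max(len(s) for s in seqs)
    return [s + [pad_idx] * (width - len(s)) for s in seqs]


def collate_with_segments(samples, pad_idx=1, segment_pad=0):
    """Build a batch dict whose net_input carries an extra 'segments' entry.

    Each sample is a dict with 'source', 'prev_output_tokens' and 'segments'
    (hypothetical keys; 'segments' holds one id per target position).
    """
    return {
        "net_input": {
            "src_tokens": pad([s["source"] for s in samples], pad_idx),
            "src_lengths": [len(s["source"]) for s in samples],
            "prev_output_tokens": pad([s["prev_output_tokens"] for s in samples], pad_idx),
            # Extra decoder feature, padded to line up with prev_output_tokens.
            "segments": pad([s["segments"] for s in samples], segment_pad),
        }
    }


samples = [
    {"source": [4, 5, 6], "prev_output_tokens": [2, 7, 8], "segments": [0, 0, 1]},
    {"source": [9, 10], "prev_output_tokens": [2, 11], "segments": [1, 1]},
]
batch = collate_with_segments(samples)
print(batch["net_input"]["segments"])  # [[0, 0, 1], [1, 1, 0]]
```

Keeping the extra field inside net_input means any caller that unpacks net_input as keyword arguments will hand it to the model without further changes.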

dearchill commented, Nov 29, 2021

Hi @raymondhs, I have a similar idea: using additional word-level input features such as POS tags, named entities, etc., and concatenating all of them into a single input to the encoder. Can you give me some ideas about how to customize the dataset and task to implement it?

Hi, I have a similar goal to yours. Did you just modify the dataset to add the additional features? Or is there perhaps an easier way to do this now? @myleott, I found that the mask_fill API is mostly concentrated on the decoder.

Yeah, I ended up with a solution that modifies the dataset. I create a word embedding for every single feature, then concatenate the feature embeddings together and use the result as the dataset. You can change the order in which the feature embeddings are concatenated. You can find more detail about this idea in the paper Improving Neural Translation Models with Linguistic Factors.

Thanks for your kind reply! I’ll figure out how to do this.
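The concatenation idea from the reply above can be sketched in PyTorch roughly as follows. This is only an illustration under stated assumptions: FactoredEmbedding is a hypothetical module name, the three feature streams (words, POS tags, named-entity tags) and their vocabulary and embedding sizes are made up, and nothing here is taken from fairseq or from the commenter's code.

```python
import torch
import torch.nn as nn


class FactoredEmbedding(nn.Module):
    """Embed several parallel feature streams and concatenate the results."""

    def __init__(self, vocab_sizes, embed_dims):
        super().__init__()
        # One embedding table per feature (e.g. word, POS tag, NER tag).
        self.tables = nn.ModuleList(
            [nn.Embedding(v, d) for v, d in zip(vocab_sizes, embed_dims)]
        )

    def forward(self, features):
        # features: list of LongTensors, each of shape (batch, seq_len).
        embedded = [table(ids) for table, ids in zip(self.tables, features)]
        # Concatenate along the embedding dimension; the order of the
        # features determines the layout of the combined vector.
        return torch.cat(embedded, dim=-1)


embed = FactoredEmbedding(vocab_sizes=[100, 20, 10], embed_dims=[32, 8, 4])
words = torch.randint(0, 100, (2, 5))
pos = torch.randint(0, 20, (2, 5))
ner = torch.randint(0, 10, (2, 5))
out = embed([words, pos, ner])
print(out.shape)  # torch.Size([2, 5, 44])
```

The output width is the sum of the per-feature embedding sizes (32 + 8 + 4 = 44 here), so whatever layer consumes the result has to be sized for the combined dimension.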
