add detailed documents for beam search implementation

The SequenceGenerator class is very hard to understand. Can someone provide detailed documentation? For example:

        # get the top beam_size active hypotheses, which are just the hypos
        # with the smallest values in active_mask
        active_hypos, _ignore = buffer('active_hypos'), buffer('_ignore')  # [b, k]
        torch.topk(
            active_mask, k=beam_size, dim=1, largest=False,
            out=(_ignore, active_hypos)
        )

        active_bbsz_idx = buffer('active_bbsz_idx')
        torch.gather(
            cand_bbsz_idx, dim=1, index=active_hypos,
            out=active_bbsz_idx,
        )
        active_scores = torch.gather(
            cand_scores, dim=1, index=active_hypos,
            out=scores[:, step].view(bsz, beam_size),
        )

        active_bbsz_idx = active_bbsz_idx.view(-1)
        active_scores = active_scores.view(-1)

        # copy tokens and scores for active hypotheses
        torch.index_select(
            tokens[:, :step + 1], dim=0, index=active_bbsz_idx,
            out=tokens_buf[:, :step + 1],
        )
        torch.gather(
            cand_indices, dim=1, index=active_hypos,
            out=tokens_buf.view(bsz, beam_size, -1)[:, :, step + 1],
        )
        if step > 0:
            torch.index_select(
                scores[:, :step], dim=0, index=active_bbsz_idx,
                out=scores_buf[:, :step],
            )
        torch.gather(
            cand_scores, dim=1, index=active_hypos,
            out=scores_buf.view(bsz, beam_size, -1)[:, :, step],
        )

The vectorized code above nearly drives me crazy. I know it speeds up computation, but it does so at the cost of readability.
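
For readers who get lost in the same excerpt, here is a loop-based sketch of what the `topk`, `gather` and `index_select` calls appear to do for one decoding step. It is not fairseq code. The function name `select_active`, the plain-list data layout and the example values are all hypothetical, and the example assumes that finished candidates were given larger `active_mask` values than unfinished ones, since the excerpt does not show how the mask is built. The per-step score history that the excerpt copies into `scores_buf` is left out to keep it short.

```python
def select_active(active_mask, cand_bbsz_idx, cand_scores, cand_indices,
                  tokens, beam_size):
    """Loop-based counterpart of the topk, gather and index_select calls.

    The first four arguments hold one row per sentence and one entry per
    candidate. `tokens` holds one row per (sentence, beam) pair, i.e.
    bsz * beam_size rows. All names and shapes here are illustrative.
    """
    new_tokens, new_scores = [], []
    for b, mask_row in enumerate(active_mask):
        # topk(largest=False): positions of the beam_size smallest mask values.
        order = sorted(range(len(mask_row)), key=lambda c: mask_row[c])
        active_hypos = order[:beam_size]
        for c in active_hypos:
            # gather on cand_bbsz_idx: which existing row candidate c extends.
            parent = cand_bbsz_idx[b][c]
            # index_select copies the parent's history; gather on
            # cand_indices supplies the token chosen at this step.
            new_tokens.append(tokens[parent] + [cand_indices[b][c]])
            # gather on cand_scores: the score of the selected candidate.
            new_scores.append(cand_scores[b][c])
    return new_tokens, new_scores


# One sentence, beam_size = 2, four candidates (made-up values).
tokens = [[2, 11], [2, 12]]               # current history of each beam
cand_bbsz_idx = [[0, 0, 1, 1]]            # parent row of each candidate
cand_indices = [[7, 3, 9, 4]]             # proposed next token
cand_scores = [[-0.1, -0.2, -0.5, -0.9]]
active_mask = [[0, 5, 2, 3]]              # candidate 1 assumed finished

new_tokens, new_scores = select_active(
    active_mask, cand_bbsz_idx, cand_scores, cand_indices, tokens, beam_size=2
)
print(new_tokens)
print(new_scores)
```

The vectorized version does the same selection for every sentence at once and writes into preallocated buffers through `out=`, which is where most of the visual noise comes from.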

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Reactions: 6
  • Comments: 9 (3 by maintainers)

Top GitHub Comments

2 reactions
myleott commented, May 30, 2019

@villmow, take a look here: https://github.com/pytorch/fairseq/commit/20bbbdc068dabef1a3a4689dd5a9ccffcc8297c5

It should be a drop-in replacement for SequenceGenerator. It’s still batched, but removes a lot of the other complexity. Happy to take a PR if you want to take a stab at integrating this more cleanly or simplifying it further.

1 reaction
myleott commented, Apr 23, 2019

Someone wrote a really nice tutorial about the beam search implementation in fairseq: http://www.telesens.co/2019/04/21/understanding-incremental-decoding-in-fairseq/

We have a slightly simpler beam search implementation, but we’d like to simplify it even further (by removing all batching) before releasing it. In any case this will be much slower than the current vectorized implementation.
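
To give a sense of what beam search looks like with all batching removed, here is a minimal sketch for a single sentence. It is not the implementation mentioned above and not fairseq's API. The names `beam_search`, `step_fn` and `toy_step` are hypothetical, the toy probability table is invented, and stopping as soon as `beam_size` hypotheses have finished is a simplification.

```python
import math


def beam_search(step_fn, bos, eos, beam_size, max_len):
    """Unbatched beam search over one sentence.

    step_fn(prefix) returns {token: log_prob} for the next token.
    Returns up to beam_size (score, tokens) pairs, best first.
    """
    beams = [(0.0, [bos])]
    finished = []
    for _ in range(max_len):
        # Expand every active hypothesis by every possible next token.
        candidates = []
        for score, toks in beams:
            for tok, logp in step_fn(toks).items():
                candidates.append((score + logp, toks + [tok]))
        candidates.sort(key=lambda c: c[0], reverse=True)

        # Look at 2 * beam_size candidates so that enough unfinished
        # ones remain even if half of them end with eos.
        beams = []
        for score, toks in candidates[: 2 * beam_size]:
            if toks[-1] == eos:
                finished.append((score, toks))
            elif len(beams) < beam_size:
                beams.append((score, toks))
        if len(finished) >= beam_size or not beams:
            break

    # Fall back to unfinished hypotheses if nothing reached eos.
    results = finished or beams
    results.sort(key=lambda c: c[0], reverse=True)
    return results[:beam_size]


def toy_step(prefix):
    """Invented next-token model that only looks at the last token."""
    table = {
        "<s>": {"a": math.log(0.6), "b": math.log(0.4)},
        "a": {"</s>": math.log(0.5), "b": math.log(0.5)},
        "b": {"</s>": math.log(0.9), "a": math.log(0.1)},
    }
    return table[prefix[-1]]


results = beam_search(toy_step, "<s>", "</s>", beam_size=2, max_len=5)
for score, toks in results:
    print(round(math.exp(score), 2), toks)
```

Each step here loops over hypotheses in Python, which is the part the vectorized implementation replaces with tensor operations over the whole batch.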
