
A FAIRSEQ text summarization example: the abstractive approach with a Levenshtein transformer


The Levenshtein transformer paper reports an improvement of 0.75+ ROUGE-L points over the baseline transformer on the abstractive text summarization task on Gigaword.

[image: benchmark table from the paper]

Our team (@fvadzim, @whiteRa2bit, @NickShatalov, and I) would love to reproduce this result as part of the intensive practicum organized by Yandex (here is the description, in Russian). After the event ends on November 16, we plan to keep working on the PR: trying the model out on a Russian news dataset and contributing docs that explain the training procedure to FAIRSEQ.

Proposal

Here is the plan of what we would love to contribute:

  1. Creating a new page on text summarization in examples

    The first sentence of the README mentions summarization among other tasks, but there is no end-to-end description of how to set it up, even though the Levenshtein transformer implementation and pay_less_attention_paper together seem to contain almost all of the code needed to make it work.

  2. Making a new task for training the Levenshtein transformer for abstractive text summarization

    The end goal would be to train the model on both English and Russian datasets; the data preparation we have in mind is sketched right after this list.
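
To make item 2 concrete, here is a minimal data-preparation sketch, assuming summarization is framed the way fairseq's other seq2seq examples frame it: the article is the "source language" and the summary is the "target language". This is our assumption, not an existing recipe in the repo; the gigaword/ paths, the article/summary suffixes, and the worker count are hypothetical.

```python
# Hypothetical layout: gigaword/{train,valid,test}.article and
# gigaword/{train,valid,test}.summary, already tokenized and BPE-encoded.
import subprocess

subprocess.run(
    [
        "fairseq-preprocess",
        # Treat articles and summaries as a parallel "translation" corpus.
        "--source-lang", "article",
        "--target-lang", "summary",
        "--trainpref", "gigaword/train",
        "--validpref", "gigaword/valid",
        "--testpref", "gigaword/test",
        "--destdir", "data-bin/gigaword",
        # Articles and summaries share a language, so share the vocabulary.
        "--joined-dictionary",
        "--workers", "8",
    ],
    check=True,
)
```

The same layout should carry over to the Russian news dataset, only with a different tokenizer and BPE vocabulary.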

Questions

  1. Are there any apparent roadblocks in the current code that you can already see preventing this plan from succeeding?

  2. The paper uses a base Transformer as a teacher to obtain a ROUGE-L of 33.81. The current NAT NMT implementation also takes the teacher approach rather than the oracle one, which should help us set up the training (a sketch of this two-step recipe follows the questions). Another training scheme, suggested by @justheuristic in private communication, is similar to the NMT refinement method introduced by @lena-voita, @rsennrich and @anvdev in this paper: produce an extractive summary first, then refine it with the Levenshtein transformer. Have you tested this idea? It would be nice to include this variation in the comparison as well.

  3. It seems that the current implementation is under active development, given a number of issues reporting SIGSEGV in multi-GPU environments:

    Is there a recommended commit of the repo to use in order to avoid these issues? Is a fix or major update coming soon?
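
To make the teacher-based setup from question 2 concrete, here is a hedged sketch of the two-step recipe as we currently understand it: first distill the training targets with an already-trained autoregressive teacher, then train the Levenshtein transformer on the distilled data. The training flags follow fairseq's examples/nonautoregressive_translation README; the checkpoint paths and the distilled data directory are hypothetical, and the teacher outputs would still need to be extracted and re-binarized between the two steps.

```python
import subprocess

# Step 1: sequence-level knowledge distillation. Decode the *training* set
# with the base transformer teacher; its beam outputs become the new targets.
subprocess.run(
    [
        "fairseq-generate", "data-bin/gigaword",
        "--path", "checkpoints/teacher/checkpoint_best.pt",  # hypothetical
        "--gen-subset", "train",
        "--beam", "5",
        "--results-path", "distilled",
    ],
    check=True,
)

# Step 2: train the Levenshtein transformer on the re-binarized distilled
# data (data-bin/gigaword_distill is hypothetical). Flags follow the
# nonautoregressive_translation example.
subprocess.run(
    [
        "fairseq-train", "data-bin/gigaword_distill",
        "--task", "translation_lev",
        "--arch", "levenshtein_transformer",
        "--criterion", "nat_loss",
        "--noise", "random_delete",
        "--share-all-embeddings",
        "--apply-bert-init",
        "--optimizer", "adam", "--adam-betas", "(0.9,0.98)",
        "--lr", "0.0005", "--lr-scheduler", "inverse_sqrt",
        "--warmup-updates", "10000", "--warmup-init-lr", "1e-07",
        "--label-smoothing", "0.1",
        "--dropout", "0.3",
        "--max-tokens", "8000",
        # The NAT example recommends the no_c10d backend, which may also
        # sidestep some of the multi-GPU issues mentioned in question 3.
        "--ddp-backend", "no_c10d",
        "--save-dir", "checkpoints/levt_gigaword",
    ],
    check=True,
)
```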

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Reactions: 4
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

2 reactions
aptlin commented, Nov 11, 2019

v2 of the paper is out. The base transformer performs better than expected, but the Levenshtein transformer still beats it on speed and provides comparable results for summarization:

[screenshot: updated benchmark table, 2019-11-11]
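
For context on the speed claim (our addition, not part of the original comment): LevT decodes by iterative insertion/deletion refinement rather than token by token, so generation cost is governed by the cap on refinement rounds. Here is a hedged decoding sketch using the flags from fairseq's nonautoregressive_translation README and the hypothetical checkpoint path from the sketches above.

```python
import subprocess

subprocess.run(
    [
        "fairseq-generate", "data-bin/gigaword",
        "--task", "translation_lev",
        "--path", "checkpoints/levt_gigaword/checkpoint_best.pt",
        "--gen-subset", "test",
        # At most 9 insertion/deletion refinement rounds regardless of output
        # length; this cap is where the speedup over beam search comes from.
        "--iter-decode-max-iter", "9",
        "--iter-decode-eos-penalty", "0",
        "--print-step",
        "--batch-size", "64",
    ],
    check=True,
)
```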

0 reactions
aptlin commented, Jun 25, 2020

No, sorry, I do not have the bandwidth right now to polish our results, but you can take a look here for the training scripts and here for fairseq with comet.ml support.
