Pegasus: replication and distillation results
Replication
The Authors numbers come from the mixed & stochastic column of this table.
dataset | Authors | This Repo | best bart | best bart name |
---|---|---|---|---|
xsum | 47.60/24.83/39.64 | 46.87/24.46/39.15 | 22.32/37.39 | distilbart-xsum-12-6 |
cnn_dailymail | 44.16/21.56/41.30 | see comment | 21.26/30.59 | distilbart-cnn-12-6 |
newsroom | 45.07/33.39/41.28 | 41.03/29.83/36.96 | | |
multi_news | 47.65/18.75/24.95 | 47.58/19.0/24.77 | | |
gigaword | 39.65/20.47/36.76 | 39.79/20.56/36.80 | | |
wikihow | 46.39/22.12/38.41 * | 46.85/23.64/28.73 | | |
reddit_tifu | 27.99/9.81/22.94 | 32.75/11.68/24.97 | | |
big_patent | 52.29/33.08/41.66 * | | | |
arxiv | 44.21/16.95/25.67 | 44.83/17.34/25.60 | | |
pubmed | 45.97/20.15/28.25 | 45.40/19.42/26.93 | | |
aeslc | 37.68/21.25/36.51 | 37.09/21.40/35.93 | | |
billsum | 59.67/41.58/47.59 | 56.18/39.94/45.39 | | |
- \* (authors' footnote): the numbers for the wikihow and big_patent datasets are not comparable because of changes in tokenization and data.
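To make the gap between the Authors and This Repo columns easier to read, the per-dataset differences can be computed directly from the table cells. The following is only a minimal sketch: the names `SCORES`, `parse_scores`, and `score_deltas` are hypothetical, only three rows are copied in for brevity, and the cell order is assumed to be ROUGE-1/ROUGE-2/ROUGE-L.

```python
# Cells copied verbatim from the table above: (Authors, This Repo).
# Assumption: each cell is ordered ROUGE-1/ROUGE-2/ROUGE-L.
SCORES = {
    "xsum": ("47.60/24.83/39.64", "46.87/24.46/39.15"),
    "gigaword": ("39.65/20.47/36.76", "39.79/20.56/36.80"),
    "billsum": ("59.67/41.58/47.59", "56.18/39.94/45.39"),
}


def parse_scores(cell):
    """Turn a cell such as '47.60/24.83/39.64' into a tuple of floats."""
    return tuple(float(part) for part in cell.split("/"))


def score_deltas(authors_cell, repo_cell):
    """Return (This Repo - Authors) for each metric, rounded to 2 decimals."""
    authors = parse_scores(authors_cell)
    repo = parse_scores(repo_cell)
    return tuple(round(r - a, 2) for a, r in zip(authors, repo))


# Negative values mean This Repo scores below the Authors' number.
DELTAS = {name: score_deltas(*cells) for name, cells in SCORES.items()}

for name, delta in DELTAS.items():
    print(name, delta)
```

Rows without a This Repo score (cnn_dailymail, big_patent) are left out, since there is nothing to subtract.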
Final Update (2020-10-16)
Mission accomplished, thanks to the work of @patil-suraj and @stas00!
The above table now shows that our results are close enough.
We suspect the remaining differences are due to the treatment of the `<n>` character that Pegasus generates and to slightly different beam search implementations.
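As an illustration of why the `<n>` marker matters, here is a minimal sketch of one way to normalize it before scoring. The helper name `normalize_pegasus_newlines` is hypothetical, and whether the marker should become a newline, a space, or nothing in order to match the authors' evaluation is an assumption that is not confirmed in this issue.

```python
def normalize_pegasus_newlines(summary, replacement="\n"):
    """Replace the literal '<n>' marker in a generated summary.

    Assumption: '<n>' stands for a line break between sentences. Pass
    replacement=" " to join the sentences on a single line instead.
    """
    # Split on the marker, trim stray spaces around each piece,
    # and drop empty pieces produced by leading or trailing markers.
    parts = [part.strip() for part in summary.split("<n>")]
    return replacement.join(part for part in parts if part)


generated = "First sentence. <n>Second sentence."
print(normalize_pegasus_newlines(generated))
print(normalize_pegasus_newlines(generated, replacement=" "))
```

The choice of replacement can change the score: a ROUGE variant that splits summaries on newlines will see two sentences in the first output and one in the second.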
Link to Spreadsheet with timing data
Questions about specific results should be asked on the forums/separate issues with @stas00, @patil-suraj, and @sshleifer tagged.
Updated the table in the issue description with the most recent results after the calculate_rouge_fix.
Moving forward, questions about specific results should be asked on the forums or in a separate issue with @stas00, @patil-suraj, and @sshleifer tagged.

Hi Sam, I have a quick question about reproducing the Gigaword results with the “google/pegasus-gigaword” checkpoint provided by Google. My setup is very simple: I load “google/pegasus-gigaword” and follow the default Hugging Face code to generate the Gigaword summaries. For the dataset, I load ‘gigaword’ directly from the datasets library without any pre-processing, and I compute the scores with the rouge_score library. However, my results on the 1951 Gigaword test samples are roughly 10 ROUGE points lower than yours (rouge1/rouge2/rougeL: 28/12/25 vs. 39.79/20.56/36.80). Could you share the setup you used to reproduce this experiment?
Thanks in advance!