
GLUE test set predictions


🚀 Feature request

Motivation

The run_glue script is super helpful, but it doesn't currently produce predictions on the test sets for the GLUE tasks. I think this would be extremely helpful for a lot of people. I'm sure plenty of people have implemented this functionality themselves, but I haven't found any such implementation. Since transformers already provides train and dev handling for GLUE, it would be cool to complete the feature set by providing test set predictions.

Your contribution

I'm personally working on a branch that extends the glue_processors to support the test sets (which are already downloaded by the recommended download_glue.py script). I also updated the run_glue.py script to produce the *.tsv files required by the GLUE online submission interface.

I think I'm a couple days out from testing/completing my implementation. I'm also sure plenty of implementations of this already exist. If there are no other plans to support this in the works, I'm happy to submit a PR.
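For context, the GLUE submission interface expects one TSV file per task with an `index` column and a `prediction` column. Below is a minimal, framework-agnostic sketch of writing such a file from classifier logits; `write_glue_predictions` and the toy logits are illustrative assumptions, not code from the branch or PR discussed here:

```python
import csv

def write_glue_predictions(logits, label_list, output_path):
    """Write predictions in the TSV layout the GLUE submission site expects:
    a header row 'index<TAB>prediction', then one row per test example.
    (Hypothetical helper for illustration, not part of run_glue.py.)"""
    # argmax over each row of logits, in pure Python
    preds = [max(range(len(row)), key=row.__getitem__) for row in logits]
    with open(output_path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(["index", "prediction"])
        for idx, pred_id in enumerate(preds):
            # map the class index back to the task's label string
            writer.writerow([idx, label_list[pred_id]])
    return preds

# Toy logits for three examples of a binary task (e.g. SST-2, labels "0"/"1")
logits = [[0.1, 0.9], [2.0, -1.0], [0.3, 0.7]]
write_glue_predictions(logits, ["0", "1"], "SST-2.tsv")  # predictions: 1, 0, 1
```

Regression tasks like STS-B would write a float score instead of a class label, but the file layout is the same.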

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Reactions: 6
  • Comments: 7 (3 by maintainers)

Top GitHub Comments

2 reactions
julien-c commented, May 25, 2020

@AMChierici make sure you run from master; there's indeed a mode kwarg now.

@shoarora Thanks for this first PR. I did check yours while merging the other (to make sure that the indices in CSV parsing, etc. were correct).

1 reaction
shoarora commented, May 25, 2020

@AMChierici I didn't author #4463, which is what made it to master to enable this feature. I haven't played with it yet, so I'm sorry I can't be of more help.


