New Evaluation: Math

We would like to find or create an eval dataset that tests mathematical knowledge. The GRE exams may be a good source of material.

  • Data processing code implemented
  • Evaluation implemented

The evaluation code should be modeled after the interface in lm_eval/base.py and the example of the BoolQ task in lm_eval/tasks/superglue.py.
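As a rough illustration of the two checklist items above (data processing and evaluation), here is a minimal, self-contained sketch. It does not use or reproduce the actual interface in lm_eval/base.py; `MathExample`, `process_raw_records`, `normalize_answer`, and `evaluate_exact_match` are hypothetical names, and exact-match scoring is only one assumed way such an eval might be graded.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class MathExample:
    """One question/answer pair (hypothetical schema, not taken from lm_eval)."""
    question: str
    answer: str


def normalize_answer(text: str) -> str:
    """Lowercase, remove spaces, and drop trailing periods so '  1/2. ' matches '1/2'."""
    cleaned = text.strip().lower().replace(" ", "")
    return cleaned.rstrip(".")


def process_raw_records(records: Iterable[dict]) -> List[MathExample]:
    """Data processing step: build examples from raw dicts, skipping incomplete rows."""
    examples = []
    for record in records:
        question = record.get("question")
        answer = record.get("answer")
        if question and answer:
            examples.append(MathExample(question.strip(), answer.strip()))
    return examples


def evaluate_exact_match(
    examples: List[MathExample], predict: Callable[[str], str]
) -> float:
    """Evaluation step: fraction of examples whose normalized prediction equals the answer."""
    if not examples:
        return 0.0
    correct = sum(
        normalize_answer(predict(ex.question)) == normalize_answer(ex.answer)
        for ex in examples
    )
    return correct / len(examples)


if __name__ == "__main__":
    # Toy data and a lookup-table "model", both made up for this sketch.
    raw = [
        {"question": "What is 2 + 2?", "answer": "4"},
        {"question": "Solve x + 1 = 3.", "answer": "x = 2"},
        {"question": "Row with no answer"},
    ]
    canned = {"What is 2 + 2?": " 4. ", "Solve x + 1 = 3.": "x = 3"}
    dataset = process_raw_records(raw)
    print(evaluate_exact_match(dataset, lambda q: canned.get(q, "")))
```

A real task would replace the `predict` callable with calls into the language model and would likely need more careful answer normalization (fractions, LaTeX, units) than this sketch attempts.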

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 7 (5 by maintainers)

Top GitHub Comments

1 reaction
hendrycks commented, Jan 11, 2021

The DeepMind mathematics dataset is in the pile.

"The GRE exams may be a good source of material."

Some GRE exams (biology, chemistry, maths, physics, cs) are covered in https://arxiv.org/pdf/2009.03300.pdf

Here’s a competition maths dataset: https://drive.google.com/file/d/1RafScUe8O6MxE4K3COnOT4CFR4zvQdXv/view?usp=sharing

I’ll have an arXiv paper for it in a month.

0 reactions
StellaAthena commented, Jan 12, 2021

"OK, though it might be an undergraduate collaborator of mine to add the code."

Understood. Thanks again!
