Adds MMLU CoT, gsm8k and arc_challenge for llama instruct #2829

Merged: 7 commits, Mar 30, 2025

anmarques
Contributor

This PR adds 3 tasks in the style used by Meta for Llama 3 models:

  • mmlu cot
  • arc_challenge
  • gsm8k

The configs are very similar to the other llama3 tasks already supported.

Notes regarding arc_challenge:

  • The original arc_challenge dataset contains 8 samples with fewer than 4 options. Meta filtered these samples out, and this PR does the same.
  • A small number of samples use 1, 2, 3, 4 as labels. The doc preprocessing replaces these with A, B, C, D to match the rest of the dataset.
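
The two arc_challenge notes above can be sketched in plain Python as follows. This is a minimal illustration, not the code from this PR: the helper names (`has_enough_options`, `normalize_labels`, `preprocess`) are hypothetical, and the doc layout (`choices.label`, `choices.text`, `answerKey`) is assumed to follow the public ARC dataset schema.

```python
# Hypothetical helpers illustrating the preprocessing described above;
# the names and structure are assumptions, not the PR's implementation.
NUMERIC_TO_LETTER = {"1": "A", "2": "B", "3": "C", "4": "D"}


def has_enough_options(doc: dict) -> bool:
    """Keep only samples that offer at least 4 answer options."""
    return len(doc["choices"]["label"]) >= 4


def normalize_labels(doc: dict) -> dict:
    """Map numeric labels (1-4) to letters (A-D); leave letter labels as-is."""
    labels = [NUMERIC_TO_LETTER.get(label, label) for label in doc["choices"]["label"]]
    answer = NUMERIC_TO_LETTER.get(doc["answerKey"], doc["answerKey"])
    # Return a new dict so the input doc is not mutated.
    return {
        **doc,
        "choices": {**doc["choices"], "label": labels},
        "answerKey": answer,
    }


def preprocess(docs: list) -> list:
    """Drop samples with fewer than 4 options, then normalize the labels."""
    return [normalize_labels(doc) for doc in docs if has_enough_options(doc)]
```

Filtering before relabeling keeps the mapping restricted to the four labels that survive; a sample with only three options never reaches `normalize_labels`.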

Below are the results obtained with this PR for Llama-3.1-8B-Instruct:

Task            This PR   Reference
MMLU CoT        72.63     73.0
ARC_Challenge   83.09     83.4
GSM8k           85.60     84.5
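
To make the gap to the reference numbers explicit, the differences can be computed directly from the scores reported above. This is only a small arithmetic sketch; the variable names are illustrative.

```python
# Scores copied from the results above: (this PR, reference).
results = {
    "MMLU CoT": (72.63, 73.0),
    "ARC_Challenge": (83.09, 83.4),
    "GSM8k": (85.60, 84.5),
}

# Difference in points, PR score minus reference score, rounded to 2 decimals.
deltas = {task: round(pr - ref, 2) for task, (pr, ref) in results.items()}

for task, delta in deltas.items():
    print(f"{task}: {delta:+.2f}")
```

MMLU CoT and ARC_Challenge land within about 0.4 points below the reference, while GSM8k lands about 1.1 points above it.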

@baberabb
Contributor

This is great! I actually added arc_challenge before but missed your details. No wonder I couldn't reproduce the results.

Just need you to add the tasks in the Readme and should be good to merge. Can you also add your note as well? I think other users will find it quite helpful!

@anmarques
Contributor Author

Thanks @baberabb. I just updated the README. Please let me know if you need anything else.

@baberabb
Contributor

Great! LGTM!

@baberabb baberabb merged commit 3816796 into EleutherAI:main Mar 30, 2025
1 of 8 checks passed