Commit 44017c4

Add new Arabic benchmarks (5) and enhance existing tasks (#372)

* Update arabic_evals.py

  Add new Arabic benchmarks and update existing tasks:
  - Renamed `arabic_mmlu` to `arabic_mmlu_mt` to highlight its machine-translated origin.
  - Added new benchmarks: `arabic_mmlu` (ArabicMMLU, https://arxiv.org/abs/2402.12840), `arabic_mmlu_ht` (human-translated), `MadinahQA` from MBZUAI, `arabic_mmmlu` (OpenAI MMMLU), and `AraTrust`, a trustworthiness benchmark for Arabic LLMs (https://arxiv.org/abs/2403.09017).
  - Enhanced prompt functions for better flexibility in answer options.

* Update and rename OALL_tasks.txt to OALL_v1_tasks.txt

  Rename file to reflect that it contains the v1 leaderboard tasks

* Create OALL_v2_tasks.txt

  Tasks for v2 of OALL

* Update all_arabic_tasks.txt

  Add new and renamed tasks

* Update arabic_evals.py

  Fix formatting issues for

* Update all_arabic_tasks.txt

  Add missing task: OpenAI's MMMLU Arabic subset

* Update all_arabic_tasks.txt

  Correct order

* Update arabic_evals.py

  Remove OpenAI MMMLU task following the discussion in #372

* Update all_arabic_tasks.txt

  Remove OpenAI MMMLU task following the discussion in #372

* Update tasks.py

  Add a templated version of Arabic MMLU, as requested by @hynky1999 in PR #372

* Update tasks.py

  Remove arabic_mmlu_templated_tasks

---------

Co-authored-by: Clémentine Fourrier <[email protected]>
Co-authored-by: Nathan Habib <[email protected]>
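The "enhanced prompt functions" item in the commit message does not show what flexibility in answer options looks like in practice. A minimal sketch follows, assuming a prompt builder that derives its answer labels from the number of options instead of hard-coding four. The function name `format_mcq_prompt` and the returned dictionary keys are hypothetical and chosen for illustration; this is not the code in `arabic_evals.py` and not lighteval's API.

```python
from string import ascii_uppercase


def format_mcq_prompt(question: str, options: list[str], gold_index: int) -> dict:
    """Build a multiple-choice prompt for any number of answer options.

    Hypothetical helper: the name and the returned keys are illustrative only.
    """
    if not 0 <= gold_index < len(options):
        raise ValueError("gold_index must point at one of the options")
    if len(options) > len(ascii_uppercase):
        raise ValueError("too many options for single-letter labels")

    # Derive labels (A, B, C, ...) from the option count rather than fixing A-D.
    labels = list(ascii_uppercase[: len(options)])

    lines = [f"Question: {question}"]
    lines += [f"{label}. {text}" for label, text in zip(labels, options)]
    lines.append("Answer:")

    return {"query": "\n".join(lines), "choices": labels, "gold_index": gold_index}
```

Because the labels come from `len(options)`, the same function would serve benchmarks whose questions carry two, three, four, or five choices without a separate prompt function per layout.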

1 parent 8a66339 commit 44017c4

6 files changed, +817 -306 lines changed

community_tasks
- arabic_evals.py
examples/tasks
src/lighteval/tasks/multilingual
- tasks.py
