CohereLabs/Global-MMLU

General NLP · EN, AR, BN · Benchmark · apache-2.0

CohereLabs/Global-MMLU is a General NLP benchmark dataset from CohereLabs, listed in EN, AR, and BN. It contains 601,734 records in Parquet format, which places it in the 100K<n<1M size category. It is distributed under the apache-2.0 license and has been downloaded 18.1K times.

📊 This dataset is used as an LLM benchmark.

About CohereLabs/Global-MMLU

Dataset Summary

Global-MMLU 🌍 is a multilingual evaluation set spanning 42 languages, including English. This dataset combines machine translations for MMLU questions along with professional translations and crowd-sourced post-edits. It also in...

Details

Task: General NLP
Language: EN, AR, BN
Format: Parquet
Rows / instances: 601,734
Size: 100K<n<1M
Creator: CohereLabs
Year: 2024
License: apache-2.0
Downloads: 18,054
Likes: 160
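
The details above can be put to use with a short loading sketch. This is a minimal example, not an official recipe: it assumes the third-party Hugging Face `datasets` library, that each language is exposed as a configuration named by its lowercase code (for example `"en"`), and that a `"test"` split exists. The record field names (`question`, `option_a` to `option_d`) and the helpers `load_global_mmlu` and `format_question` are hypothetical and should be checked against the dataset itself.

```python
def load_global_mmlu(language="en", split="test"):
    """Fetch one language subset from the Hugging Face Hub (needs network).

    Assumption: configurations are named by language code and a "test"
    split exists; neither is confirmed by this page.
    """
    from datasets import load_dataset  # third-party: pip install datasets

    return load_dataset("CohereLabs/Global-MMLU", language, split=split)


def format_question(row):
    """Render one multiple-choice record as a plain-text prompt.

    Assumption: records carry "question" and "option_a".."option_d" fields.
    """
    lines = [row["question"]]
    for letter in "abcd":
        lines.append(f"{letter.upper()}. {row['option_' + letter]}")
    lines.append("Answer:")
    return "\n".join(lines)


# Hypothetical record for illustration only; it is not taken from the dataset.
example = {
    "question": "What is 2 + 2?",
    "option_a": "3",
    "option_b": "5",
    "option_c": "4",
    "option_d": "22",
}
prompt = format_question(example)
```

Calling `load_global_mmlu("ar")` would, under the same assumptions, return the Arabic subset; iterating over the result and passing each row to `format_question` yields prompts ready for a model under evaluation.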
