allenai/dolma3_longmino_pool

General NLP · EN · odc-by

allenai/dolma3_longmino_pool is an English-language General NLP dataset from allenai, distributed in Parquet format under the odc-by license. It has been downloaded about 11.5K times.

About allenai/dolma3_longmino_pool

⚠️ IMPORTANT NOTICE ⚠️ This is the Dolma 3 Longmino pool; it hasn't been mixed. If you are interested in the data used to train:
Olmo 3 7B: allenai/dolma3_longmino_mix-50B-1025
Olmo 3 32B: allenai/dolma3_dolmino_mix-100B-1125
Dolm...

Details

Task: General NLP
Language: EN
Format: Parquet
Rows / instances: N/A
Creator: allenai
Year: 2025
License: odc-by
Downloads: 11521
Likes: 13