allenai/dolma3_dolmino_mix-100B-1025

Text Generation | EN | odc-by

allenai/dolma3_dolmino_mix-100B-1025 is an English text generation dataset from allenai, distributed in Parquet format under the odc-by license. It falls in the 10M<n<100M size category and has been downloaded 24.6K times.

About allenai/dolma3_dolmino_mix-100B-1025

Dolma 3 Dolmino Mix (100B)

The Dolma 3 Dolmino Mix (100B) is the mixture of high-quality data used for the second stage of training for the Olmo 3 7B model.

Dataset Sources

Columns: Source, Category, Tokens, Documents
TinyMATH Mind Math ...

Details

Task: Text Generation
Language: EN
Format: Parquet
Rows / instances: N/A
Size: 10M<n<100M
Creator: allenai
Year: 2025
License: odc-by
Downloads: 24582
Likes: 10
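
Because the row count is listed as N/A and the mix is large, it can help to peek at a few records before downloading anything in full. The following is a minimal sketch, not an official loading recipe. It assumes the dataset is served from the Hugging Face Hub under the ID shown above, that the third-party `datasets` package is installed, and that a split named "train" exists. The helper names `first_records` and `open_stream` are hypothetical, and a configuration name may also be required.

```python
from itertools import islice

# Dataset ID exactly as it appears in this listing.
DATASET_ID = "allenai/dolma3_dolmino_mix-100B-1025"


def first_records(records, n=3):
    """Return the first n items of any iterable without reading the rest."""
    return list(islice(records, n))


def open_stream(split="train"):
    """Open the dataset lazily so nothing is downloaded up front.

    Assumptions (not confirmed by this listing): the dataset is hosted on
    the Hugging Face Hub and exposes a split called "train".
    """
    # Imported here so the helper above works without the package installed.
    from datasets import load_dataset  # third-party: pip install datasets

    return load_dataset(DATASET_ID, split=split, streaming=True)


# Usage (requires network access):
#   for record in first_records(open_stream()):
#       print(sorted(record.keys()))
```

Streaming reads the Parquet shards on demand, so inspecting the field names this way avoids fetching the whole mix.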
