arcee-ai/The-Tome
General NLP · English · Benchmark · mit
The arcee-ai/The-Tome dataset is an English General NLP resource released by arcee-ai in 2024. It has 176 downloads and 107 likes. It is released under the MIT license and falls in the 1M<n<10M size category, meaning between 1 million and 10 million instances.
This dataset is used as an LLM benchmark.
About arcee-ai/The-Tome
The Tome is a curated dataset designed for training large language models with a focus on instruction following. It was used in the training of the Arcee-Nova/Spark models, which were later merged with Qwen2-72B-Instruct (or 7B with the Spark model...
Details
- Task: General NLP
- Language: English
- Format: Parquet
- Rows / instances: N/A
- Size: 1M<n<10M
- Creator: arcee-ai
- Year: 2024
- License: mit
- Downloads: 176
- Likes: 107
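Given the Parquet format and the size category listed above, it may be convenient to inspect a few rows before committing to a full download. The following is a minimal sketch, assuming the dataset is hosted on the Hugging Face Hub under the id shown on this page, that the `datasets` library is installed, and that a split named `train` exists (the split name is an assumption, not stated here). The helper `peek_rows` and its `loader` parameter are hypothetical names introduced for illustration.

```python
from itertools import islice

REPO_ID = "arcee-ai/The-Tome"  # dataset id as given on this page


def peek_rows(n=3, split="train", loader=None):
    """Return the first n rows without downloading the whole dataset.

    `split="train"` is an assumption; adjust it if the dataset uses
    different split names. `loader` exists only so the function can be
    exercised offline with a stand-in.
    """
    if loader is None:
        # Imported lazily so this module loads even where `datasets`
        # is not installed.
        from datasets import load_dataset
        loader = load_dataset
    # streaming=True reads the Parquet shards lazily instead of
    # fetching all of them up front.
    stream = loader(REPO_ID, split=split, streaming=True)
    return list(islice(stream, n))


if __name__ == "__main__":
    # Column names are not listed on this page, so print whatever keys
    # the first few rows actually contain.
    for row in peek_rows():
        print(sorted(row.keys()))
```

Streaming is used here because the row count is listed as N/A and the size category allows up to 10 million instances, so a full download may be more than a quick inspection needs.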