dell-research-harvard/AmericanStories
Text Classification, Text Generation, Text Retrieval, Summarization, Question Answering, EN, Benchmark, cc-by-4.0
dell-research-harvard/AmericanStories is an English-language (EN) text classification benchmark dataset published by dell-research-harvard in Parquet format. It is distributed under the cc-by-4.0 license, falls in the 100M<n<1B size category, and has been downloaded about 6.8K times.
This dataset is used as an LLM benchmark.
About dell-research-harvard/AmericanStories
American Stories offers high-quality structured data from historical newspapers, suitable for pre-training large language models to improve their understanding of historical English and world knowledge. It can also be integrated into external databas...
Details
- Task: Text Classification, Text Generation, Text Retrieval, Summarization, Question Answering
- Language: EN
- Format: Parquet
- Rows / instances: N/A
- Size: 100M<n<1B
- Creator: dell-research-harvard
- Year: 2023
- License: cc-by-4.0
- Downloads: 6824
- Likes: 169