dell-research-harvard/AmericanStories

Text Classification, Text Generation, Text Retrieval, Summarization, Question Answering, EN, Benchmark, cc-by-4.0

dell-research-harvard/AmericanStories is an English (EN) benchmark dataset from dell-research-harvard, distributed in Parquet format under the cc-by-4.0 license. It is tagged for text classification, text generation, text retrieval, summarization, and question answering, falls in the 100M<n<1B size category, and has been downloaded about 6.8K times.

This dataset is used as an LLM benchmark.

About dell-research-harvard/AmericanStories

American Stories offers high-quality structured data from historical newspapers suitable for pre-training large language models to enhance the understanding of historical English and world knowledge. It can also be integrated into external databas...

Details

Task: Text Classification, Text Generation, Text Retrieval, Summarization, Question Answering
Language: EN
Format: Parquet
Rows / instances: N/A
Size: 100M<n<1B
Creator: dell-research-harvard
Year: 2023
License: cc-by-4.0
Downloads: 6824
Likes: 169
