dell-research-harvard/newswire

Tags: Text Classification, Text Generation, Text Retrieval, Summarization, Question Answering, EN, Benchmark, cc-by-4.0

Created by dell-research-harvard in 2024, dell-research-harvard/newswire is an English-language benchmark dataset distributed in Parquet format and tagged for text classification, among other tasks. It has 410 downloads and 91 likes, is released under the cc-by-4.0 license, and falls in the 1M<n<10M size category.

This dataset is used as an LLM benchmark.

About dell-research-harvard/newswire

Dataset Card for NewsWire

Dataset Summary

NewsWire contains 2.7 million unique public domain U.S. news wire articles, written between 1878 and 1977. Locations in these articles are georeferenced, topics are tagged using customized ne...

Details

Task: Text Classification, Text Generation, Text Retrieval, Summarization, Question Answering
Language: EN
Format: Parquet
Rows / instances: N/A
Size: 1M<n<10M
Creator: dell-research-harvard
Year: 2024
License: cc-by-4.0
Downloads: 410
Likes: 91
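
The size listed in the details above is a bucket, not an exact row count. As a minimal sketch, assuming the `1M<n<10M` notation uses K, M, B, and T for thousands, millions, billions, and trillions, the bounds can be parsed and compared with the 2.7 million articles mentioned in the dataset summary. The helper name `size_category_bounds` is hypothetical and is not part of any library.

```python
import re

# Assumed unit suffixes for the size-category notation.
_UNITS = {"": 1, "K": 1_000, "M": 1_000_000, "B": 1_000_000_000, "T": 1_000_000_000_000}


def size_category_bounds(category: str) -> tuple[int, int]:
    """Parse a bucket such as '1M<n<10M' into (lower, upper) row counts."""
    match = re.fullmatch(r"(\d+)([KMBT]?)<n<(\d+)([KMBT]?)", category)
    if match is None:
        raise ValueError(f"unrecognized size category: {category!r}")
    low, low_unit, high, high_unit = match.groups()
    return int(low) * _UNITS[low_unit], int(high) * _UNITS[high_unit]


lower, upper = size_category_bounds("1M<n<10M")
article_count = 2_700_000  # "2.7 million" from the dataset summary
in_bucket = lower < article_count < upper
print(lower, upper, in_bucket)  # prints: 1000000 10000000 True
```

The check only confirms that the article count in the summary is consistent with the listed size bucket; the exact number of rows is given as N/A on this page.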
