Reuters-21578 Benchmark Corpus Dataset
Created by Lewis et al. in 1997, the Reuters-21578 Benchmark Corpus Dataset is a collection of 10,788 English-language documents from the Reuters financial newswire service, partitioned into a training set of 7,769 documents and a test set of 3,019 documents. The dataset is provided in TSV file format.
Dataset Sources
Here you can download the Reuters-21578 Benchmark Corpus dataset in TSV format.
Download Reuters-21578 Benchmark Corpus dataset TSV files
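Once the TSV files are downloaded, they can be read with Python's standard library. The snippet below is a minimal sketch, not the official loader: the two-column layout (label first, text second, no header row) and the file name in the usage comment are assumptions, so check them against the actual files before relying on it.

```python
import csv
from collections import Counter
from typing import Iterable, List, Tuple


def read_tsv(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse (label, text) pairs from tab-separated lines.

    Assumption: column 0 is the label and column 1 is the document text.
    """
    # QUOTE_NONE keeps quote characters inside news text as literal characters.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for record in reader:
        if len(record) < 2:
            continue  # skip blank or malformed lines
        rows.append((record[0], record[1]))
    return rows


def load_split(path: str) -> List[Tuple[str, str]]:
    """Read one split (for example the training file) from disk."""
    with open(path, encoding="utf-8", newline="") as handle:
        return read_tsv(handle)


def label_counts(rows: List[Tuple[str, str]]) -> Counter:
    """Count how many documents carry each label."""
    return Counter(label for label, _ in rows)


# Hypothetical usage; "reuters_train.tsv" is a placeholder file name:
# train = load_split("reuters_train.tsv")
# print(len(train), label_counts(train).most_common(5))
```

If the loader matches the files, the row counts of the two splits should agree with the 7,769 training and 3,019 test documents stated above, which makes a quick sanity check.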
Fine-tune with Reuters-21578 Benchmark Corpus dataset
Metatext is a no-code tool for training, tuning, and integrating custom NLP models.
Paper
Read the full original Reuters-21578 Benchmark Corpus paper.