Classify and extract text 10x better and faster 🦾



Reuters-21578 Benchmark Corpus Dataset

Created by Lewis et al. in 1997, the Reuters-21578 Benchmark Corpus dataset is a collection of 10,788 English-language documents from the Reuters financial newswire service, partitioned into a training set of 7,769 documents and a test set of 3,019 documents. All 10,788 documents are provided in TSV file format.

Dataset Sources

Here you can download the Reuters-21578 Benchmark Corpus dataset in TSV format.

Download Reuters-21578 Benchmark Corpus dataset TSV files
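
Once the TSV files are downloaded, they can be read with Python's standard csv module. The snippet below is a minimal sketch, not the official loader: the column layout (document text in the first column, topic label in the second) is an assumption to check against the actual files, and the in-memory rows are placeholders rather than real corpus content.

```python
import csv
import io
from collections import Counter


def read_tsv(handle, text_col=0, label_col=1):
    """Yield (text, label) pairs from a tab-separated file handle.

    The column positions are assumptions; adjust them to the downloaded file.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) <= max(text_col, label_col):
            continue  # skip blank or malformed lines
        yield row[text_col], row[label_col]


# Placeholder rows standing in for a real file (not actual corpus content).
sample = io.StringIO(
    "first placeholder document\ttopic_a\n"
    "second placeholder document\ttopic_b\n"
    "third placeholder document\ttopic_a\n"
)

pairs = list(read_tsv(sample))
label_counts = Counter(label for _, label in pairs)
print(len(pairs), label_counts.most_common())
```

To read a downloaded file instead, pass an open handle, for example open("train.tsv", newline="", encoding="utf-8"), where train.tsv is a hypothetical file name. Counting the rows of each split is a quick way to confirm the 7,769 / 3,019 partition described above.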

Fine-tune with Reuters-21578 Benchmark Corpus dataset

Metatext is a no-code tool for training, tuning, and integrating custom NLP models.


Paper

Read the full original Reuters-21578 Benchmark Corpus paper.

Download PDF paper


Metatext helps you classify and extract information from text and documents using language models customized with your data and expertise.