Classify and extract text 10x better and faster 🦾


➡️  Learn more

Cornell Newsroom Dataset

Created by Grusky et al. in 2018, the Cornell Newsroom Dataset contains 1.3 million articles and summaries written by authors and editors in the newsrooms of 38 major publications. The summaries were obtained from search and social metadata between 1998 and 2017. The dataset is in English and is distributed in JSON format.

Dataset Sources

Here you can download the Cornell Newsroom dataset in JSON format.

Download Cornell Newsroom dataset JSON files
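
Once downloaded, the JSON files can be read with a few lines of Python. The sketch below is only illustrative: this page states that the data is JSON, but the one-object-per-line layout and the field names "text" and "summary" are assumptions, so check them against the actual files before relying on them.

```python
import json
from typing import Iterable, Iterator


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty line (assumes one JSON object per line)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def summary_length_ratio(record: dict,
                         text_key: str = "text",
                         summary_key: str = "summary") -> float:
    """Summary word count divided by article word count (0.0 for an empty article).

    The default key names are hypothetical; adjust them to the real schema.
    """
    text_words = len(record.get(text_key, "").split())
    summary_words = len(record.get(summary_key, "").split())
    return summary_words / text_words if text_words else 0.0


# Tiny in-memory sample standing in for a downloaded file.
sample = [
    '{"text": "one two three four", "summary": "one two"}',
    '',
    '{"text": "", "summary": "orphan"}',
]
records = list(iter_records(sample))
ratios = [summary_length_ratio(r) for r in records]
```

For a real file, pass an open file handle instead of the sample list, for example `with open(path, encoding="utf-8") as f: records = list(iter_records(f))`, where `path` points to wherever the download was saved.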

Fine-tune with Cornell Newsroom dataset

Metatext is a powerful no-code tool to train, tune, and integrate custom NLP models.

Paper

Read the full original Cornell Newsroom paper.

Download PDF paper


Metatext helps you classify and extract information from text and documents using language models customized with your data and expertise.