Cornell Newsroom Dataset
Created by Grusky et al. in 2018, the Cornell Newsroom Dataset contains 1.3 million articles and summaries written by authors and editors in the newsrooms of 38 major publications. The summaries were obtained from search and social metadata between 1998 and 2017. The dataset is in English and is distributed in JSON format.
Dataset Sources
Here you can download the Cornell Newsroom dataset in JSON format.
Download Cornell Newsroom dataset JSON files
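Once the files are downloaded, they can be read with a few lines of Python. The sketch below is only an illustration under stated assumptions: it assumes the data is stored as JSON Lines (one JSON object per line, optionally gzip-compressed) and that each record carries `text` and `summary` fields. The helper names `iter_records` and `summary_pairs` and the file name in the usage note are hypothetical, so check them against the files you actually receive.

```python
import gzip
import json
from pathlib import Path
from typing import Iterator, Tuple


def iter_records(path: str) -> Iterator[dict]:
    """Yield one record per non-empty line of a JSON Lines file.

    Assumption: the file holds one JSON object per line and may be gzipped.
    """
    # Pick the gzip opener only when the file name ends in ".gz".
    opener = gzip.open if Path(path).suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


def summary_pairs(
    path: str, text_key: str = "text", summary_key: str = "summary"
) -> Iterator[Tuple[str, str]]:
    """Yield (article, summary) pairs, skipping records missing either field.

    Assumption: the field names "text" and "summary" match the downloaded files.
    """
    for record in iter_records(path):
        text, summary = record.get(text_key), record.get(summary_key)
        if text and summary:
            yield text, summary
```

For example, `for article, summary in summary_pairs("train.jsonl.gz"): ...` would stream pairs without loading the whole file into memory, which matters for a corpus of this size.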
Fine-tune with Cornell Newsroom dataset
Metatext is a powerful no-code tool for training, tuning, and integrating custom NLP models.
Paper
Read the full original Cornell Newsroom paper.