Europeana Newspapers Dataset

Classify and extract text 10x better and faster 🦾

Created by Neudecker in 2016, the Europeana Newspapers dataset contains Named Entity Recognition corpora for Dutch, French, and German, drawn from Europeana Newspapers. The data is multilingual and encoded in the IOB format. The dataset size is listed as 486,218, distributed in BIO file format.
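As a rough illustration of what IOB (BIO) encoding looks like in practice, here is a minimal Python sketch that reads tagged lines and groups them into entities. It assumes one token per line with the tag as the last whitespace-separated column and blank lines between sentences; the actual column layout of the downloadable files may differ. The sample sentence, the `PER`/`LOC` labels, and the function names `read_iob` and `extract_entities` are made up for this example.

```python
from typing import Iterable, List, Tuple

Sentence = List[Tuple[str, str]]


def read_iob(lines: Iterable[str]) -> List[Sentence]:
    """Group (token, tag) pairs into sentences; a blank line ends a sentence."""
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        parts = raw.split()
        if not parts:
            # Blank line: close the sentence in progress, if any.
            if current:
                sentences.append(current)
                current = []
            continue
        if len(parts) < 2:
            # Assumption: a line without a tag column is malformed, so skip it.
            continue
        current.append((parts[0], parts[-1]))
    if current:
        sentences.append(current)
    return sentences


def extract_entities(sentence: Sentence) -> List[Tuple[str, str]]:
    """Merge B-/I- tagged tokens into (entity text, label) pairs."""
    entities: List[Tuple[str, str]] = []
    tokens: List[str] = []
    label = None
    for token, tag in sentence:
        prefix, _, kind = tag.partition("-")
        if prefix == "B" or (prefix == "I" and kind != label):
            # A new entity starts (an I- tag with no matching open entity
            # is also treated as a start).
            if tokens:
                entities.append((" ".join(tokens), label))
            tokens, label = [token], kind
        elif prefix == "I":
            tokens.append(token)
        else:
            # "O" (outside) closes any open entity.
            if tokens:
                entities.append((" ".join(tokens), label))
            tokens, label = [], None
    if tokens:
        entities.append((" ".join(tokens), label))
    return entities


# Hypothetical sample, not taken from the dataset.
sample = """Angela B-PER
Merkel I-PER
besuchte O
Berlin B-LOC
. O

Paris B-LOC
"""

for sent in read_iob(sample.splitlines()):
    print(extract_entities(sent))
```

To try this on a downloaded file, one could pass an open file handle to `read_iob` in place of `sample.splitlines()`, after checking that the file really follows the assumed layout.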

Dataset Sources

Here you can download the Europeana Newspapers dataset in BIO format.

Download Europeana Newspapers dataset BIO files

Fine-tune with Europeana Newspapers dataset

Metatext is a no-code tool for training, tuning, and integrating custom NLP models.


Paper

Read the full original Europeana Newspapers paper.

Download PDF paper


Metatext helps you classify and extract information from text and documents using language models customized with your own data and expertise.
