Classify and extract text 10x better and faster 🦾


➡️  Learn more

Harvard Library Dataset

Created by Harvard, the Harvard Library Dataset covers books, journals, electronic resources, manuscripts, archival materials, scores, audio, video, and other materials in English. It contains 12.7M records in MODS and Dublin Core file formats.

Dataset Sources

Here you can download the Harvard Library dataset in MODS and Dublin Core formats.

Download Harvard Library dataset MODS, Dublin Core files
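
As a rough illustration of working with the Dublin Core format, the sketch below reads one record with Python's standard library. The sample record and its `record` wrapper element are hypothetical, as is the `parse_dc_record` helper; only the Dublin Core element namespace is standard, and the actual downloaded files may be structured differently.

```python
import xml.etree.ElementTree as ET

# Standard namespace for the Dublin Core Metadata Element Set 1.1.
DC_NAMESPACE = "http://purl.org/dc/elements/1.1/"

# Hypothetical sample record, made up for illustration only.
# The real files may use a different wrapper element and more fields.
SAMPLE_RECORD = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An Example Title</dc:title>
  <dc:creator>Example, Author</dc:creator>
  <dc:language>eng</dc:language>
  <dc:type>text</dc:type>
</record>"""


def parse_dc_record(xml_text):
    """Return a dict mapping each Dublin Core element name to a list of values."""
    root = ET.fromstring(xml_text)
    prefix = "{" + DC_NAMESPACE + "}"
    fields = {}
    for elem in root.iter():
        # ElementTree stores qualified names as "{namespace}localname".
        if elem.tag.startswith(prefix):
            name = elem.tag[len(prefix):]
            fields.setdefault(name, []).append((elem.text or "").strip())
    return fields


record = parse_dc_record(SAMPLE_RECORD)
print(record["title"])
```

Values are kept in lists because Dublin Core allows any element to repeat within a record, for example several `dc:creator` entries for one item.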

Fine-tune with Harvard Library dataset

Metatext is a powerful no-code tool for training, tuning, and integrating custom NLP models.


Paper

Read the full original Harvard Library paper.

Download PDF paper


Metatext helps you classify and extract information from text and documents using language models customized with your data and expertise.