Deft Dataset
Created by Spala et al. in 2019, the Deft Dataset contains annotated English-language content from two data sources: 1) 2,443 sentences from various 2017 SEC contract filings in the publicly available US Securities and Exchange Commission (SEC) EDGAR database, and 2) 21,303 sentences from open-source textbooks covering topics in biology, history, physics, psychology, economics, sociology, and government. In total, it contains 23,746 sentences in text file format.
Dataset Sources
Here you can download the Deft dataset in text format.
Download Deft dataset Text files
Fine-tune with Deft dataset
Metatext is a powerful no-code tool to train, tune, and integrate custom NLP models.
Paper
Read the full original Deft paper.