Feature Extraction Datasets

There are 6 feature extraction datasets in our directory. Each links to its source, paper, and download. Browse the full list below or filter by language.

Feature extraction is the task of turning text into dense numerical embeddings for downstream search, clustering, or retrieval.
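
To make the definition concrete, here is a minimal sketch of the text-to-vector-to-search flow. It is not tied to any dataset in this directory. The `embed` function is a toy stand-in (hashed bag-of-words) for a trained embedding model, and the names `DIM`, `embed`, `cosine`, and `search` are hypothetical, chosen only for illustration.

```python
import hashlib
import math

DIM = 256  # hypothetical embedding size, chosen only for illustration


def embed(text: str, dim: int = DIM) -> list[float]:
    """Toy stand-in for a learned model: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # md5 gives a stable hash, so the same token always lands in the same bucket
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a: list[float], b: list[float]) -> float:
    """Dot product of two unit-length vectors, which equals their cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


def search(query: str, corpus: list[str]) -> str:
    """Return the corpus entry whose embedding is closest to the query's."""
    q = embed(query)
    return max(corpus, key=lambda doc: cosine(q, embed(doc)))
```

In a real pipeline a trained embedding model would replace `embed`, while the comparison step (cosine similarity over fixed-length vectors) would stay much the same. The datasets listed here are the kind of data used to train or evaluate such models.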

Updated June 2026

What languages do feature extraction datasets cover?

Explore other dataset tasks

Frequently asked questions