Portuguese Datasets

We catalog 15 Portuguese datasets for NLP and machine learning. Browse the list below or narrow down by task.

Portuguese is an official language of Brazil, Portugal, and several African nations, and its NLP coverage is growing.

Updated June 2026

What tasks do Portuguese datasets cover?

Datasets in other languages

Frequently asked questions