Classify and extract text 10x better and faster 🦾



KALIMAT Multipurpose Arabic Corpus Dataset

Created by El-Haj et al. in 2013, the KALIMAT Multipurpose Arabic Corpus Dataset contains 20,291 Arabic articles collected from the Omani newspaper Alwatan. It also includes extractive single-document and multi-document system summaries, as well as named-entity-recognised versions of the articles. The articles fall into 6 categories: culture, economy, local-news, international-news, religion, and sports. All 20,291 articles are in Arabic and are distributed in text file format.

Dataset Sources

Here you can download the KALIMAT Multipurpose Arabic Corpus dataset in Text format.

Download KALIMAT Multipurpose Arabic Corpus dataset Text files
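
Once the text files are downloaded, a quick sanity check is to count the articles per category. The sketch below is only an illustration: it assumes a hypothetical layout with one subfolder per category containing `.txt` files, and it assumes UTF-8 encoding. Neither is confirmed by this page, so adjust both to match the archive you actually receive. The function names `count_articles` and `read_article` are likewise made up for this example.

```python
from collections import Counter
from pathlib import Path

# The six categories named in the dataset description.
CATEGORIES = (
    "culture",
    "economy",
    "local-news",
    "international-news",
    "religion",
    "sports",
)


def count_articles(root):
    """Count .txt files per category.

    Assumes (hypothetically) that `root` holds one subfolder per category.
    Categories without a matching folder are reported as 0.
    """
    root = Path(root)
    counts = Counter()
    for category in CATEGORIES:
        folder = root / category
        if folder.is_dir():
            counts[category] = sum(1 for _ in folder.glob("*.txt"))
        else:
            counts[category] = 0
    return counts


def read_article(path):
    """Return the text of one article, assuming UTF-8 encoding."""
    return Path(path).read_text(encoding="utf-8")
```

For example, `count_articles("kalimat")` (where `kalimat` is a placeholder for wherever the files were extracted) returns a `Counter` keyed by category name; under the assumed layout the counts should sum to the 20,291 articles stated in the description.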

Fine-tune with KALIMAT Multipurpose Arabic Corpus dataset

Metatext is a no-code tool for training, tuning, and integrating custom NLP models.


Paper

Read the full original KALIMAT Multipurpose Arabic Corpus paper.

Download PDF paper



Metatext helps you classify and extract information from text and documents using language models customized with your data and expertise.