Classify and extract text 10x better and faster 🦾


➡️  Learn more

Aesthetics Text Corpus Dataset

Created by Venugopal et al. in 2019, the Aesthetics Text Corpus Dataset consists of novels and short stories written in Hindi. The novels and stories were scraped from http://hindisamay.com, from http://premchand.co.in (a website dedicated to the stories of the popular novelist Premchand), and from the Bhandarkar Oriental Research Institute's Digital Library (http://borilib.com). As a preprocessing step, the text was split into sentences, and special characters, English tokens, and Latin numbers were deleted. The listing gives the dataset size as 978 (unit not stated), in text file format.
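The preprocessing described above can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `preprocess`, the choice of sentence delimiters (the Devanagari danda, double danda, `?` and `!`), and the rule "keep only characters from the Devanagari Unicode block" are all assumptions made for illustration. Devanagari digits are kept, since the description only mentions deleting Latin numbers.

```python
import re

# Assumption: sentences end at the danda (U+0964), double danda (U+0965), '?' or '!'.
SENTENCE_END = re.compile(r"[\u0964\u0965?!]+")

# Assumption: anything outside the Devanagari block (U+0900-U+097F) that is not
# whitespace counts as a special character, English token, or Latin number.
NON_DEVANAGARI = re.compile(r"[^\u0900-\u097F\s]")


def preprocess(text):
    """Split Hindi text into sentences and strip non-Devanagari characters."""
    sentences = []
    for raw in SENTENCE_END.split(text):
        # Replace removed characters with spaces so neighbouring words stay separate.
        cleaned = NON_DEVANAGARI.sub(" ", raw)
        # Collapse runs of whitespace left behind by the removal.
        cleaned = " ".join(cleaned.split())
        if cleaned:
            sentences.append(cleaned)
    return sentences


if __name__ == "__main__":
    sample = "राम घर गया। Hello 123 वह खुश था!"
    for sentence in preprocess(sample):
        print(sentence)
```

Splitting before cleaning matters here, because the danda itself lies inside the Devanagari block and would otherwise survive the character filter.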

Dataset Sources

Here you can download the Aesthetics Text Corpus dataset in Text format.

Download Aesthetics Text Corpus dataset Text files

Fine-tune with Aesthetics Text Corpus dataset

Metatext is a no-code tool for training, tuning, and integrating custom NLP models.


Paper

Read the full original Aesthetics Text Corpus paper.

Download PDF paper



Metatext helps you classify and extract information from text and documents using language models customized with your data and expertise.