Classify and extract text 10x better and faster 🦾


➡️  Learn more

HJDataset

Created by Shen et al. in 2020, HJDataset contains over 250,000 layout element annotations of seven types in Japanese-language documents. The annotations are provided in JSON format.

Dataset Sources

Here you can download the HJDataset dataset in JSON format.

Download HJDataset dataset JSON files
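
Once the JSON files are downloaded, a quick first check is to count how many annotations each layout type has. The sketch below is only an illustration: it assumes the files follow a COCO-style layout with top-level `categories` and `annotations` lists, which this page does not confirm, and the file path and category names shown are hypothetical placeholders.

```python
import json
from collections import Counter


def load_annotations(path: str) -> dict:
    """Read one annotation file into a dict (the path is supplied by the caller)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def count_annotations_by_category(data: dict) -> Counter:
    """Count annotations per category name.

    Assumes a COCO-style structure: data["categories"] holds {"id", "name"}
    entries and data["annotations"] holds entries with a "category_id".
    """
    # Map each category id to its readable name.
    id_to_name = {c["id"]: c["name"] for c in data.get("categories", [])}
    # Tally annotations; unknown ids fall back to their string form.
    return Counter(
        id_to_name.get(a["category_id"], str(a["category_id"]))
        for a in data.get("annotations", [])
    )


if __name__ == "__main__":
    # Tiny in-memory stand-in; the category names are hypothetical.
    sample = {
        "categories": [{"id": 1, "name": "type_a"}, {"id": 2, "name": "type_b"}],
        "annotations": [
            {"category_id": 1},
            {"category_id": 1},
            {"category_id": 2},
        ],
    }
    for name, n in count_annotations_by_category(sample).items():
        print(name, n)
```

For a real file, pass the result of `load_annotations("path/to/annotations.json")` to `count_annotations_by_category`; if the keys differ from the assumed layout, adjust the two lookups accordingly.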

Fine-tune with HJDataset dataset

Metatext is a powerful no-code tool for training, tuning, and integrating custom NLP models.

Paper

Read the full original HJDataset paper.

Download PDF paper


Metatext helps you classify and extract information from text and documents using language models customized with your data and expertise.