Classify and extract text 10x better and faster 🦾


➡️  Learn more

NCLS-Corpora Dataset

Created by Zhu et al. in 2019, NCLS-Corpora contains two datasets for cross-lingual summarization (CLS): ZH2ENSUM and EN2ZHSUM. EN2ZHSUM has 370,759 English-to-Chinese CLS pairs built from ENSUM, and ZH2ENSUM has 1,699,713 Chinese-to-English CLS pairs. The data is in Chinese and English and totals 2M+ pairs in text file format.

Dataset Sources

Download the NCLS-Corpora dataset in text format here.

Download NCLS-Corpora dataset Text files

Fine-tune with NCLS-Corpora dataset

Metatext is a powerful no-code tool for training, tuning, and integrating custom NLP models.


Paper

Read the full original NCLS-Corpora paper.

Download PDF paper



Metatext helps you classify and extract information from text and documents using language models customized with your data and expertise.