NCLS-Corpora Dataset
Created by Zhu et al. in 2019, NCLS-Corpora contains two datasets for cross-lingual summarization (CLS): ZH2ENSUM and EN2ZHSUM. It provides 370,759 English-to-Chinese CLS pairs from ENSUM and 1,699,713 Chinese-to-English CLS pairs. The data covers Chinese and English and contains more than 2M pairs in text file format.
Dataset Sources
Here you can download the NCLS-Corpora dataset in Text format.
Download NCLS-Corpora dataset Text files
Fine-tune with NCLS-Corpora dataset
Metatext is a powerful no-code tool for training, tuning, and integrating custom NLP models.
Paper
Read the full original NCLS-Corpora paper.