Classify and extract text 10x better and faster 🦾


➡️  Learn more

DiscoFuse Dataset

Created by Geva et al. in 2019, the DiscoFuse Dataset contains examples for training sentence fusion models. Sentence fusion is the task of joining several independent sentences into a single coherent text. The data was collected from Wikipedia and from sports articles, and is in English. It contains ~60M examples in TSV file format.

Dataset Sources

Here you can download the DiscoFuse dataset in TSV format.

Download DiscoFuse dataset TSV files
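
As a rough illustration of how the TSV files might be turned into training pairs for sentence fusion, here is a minimal Python sketch. The column names (`incoherent_first_sentence`, `coherent_first_sentence`, and so on) are an assumption about the file header and should be checked against the downloaded files. The two sample rows are invented for illustration and are not taken from the dataset, and `read_fusion_pairs` is a hypothetical helper name.

```python
import csv
import io

# Invented two-row sample standing in for a downloaded DiscoFuse TSV file.
# ASSUMPTION: the column names below match the real header; verify before use.
SAMPLE_TSV = (
    "coherent_first_sentence\tcoherent_second_sentence\t"
    "incoherent_first_sentence\tincoherent_second_sentence\n"
    "She was tired, but she kept running.\t\t"
    "She was tired.\tShe kept running.\n"
    "The team won the match.\tAs a result, it advanced to the final.\t"
    "The team won the match.\tThe team advanced to the final.\n"
)


def join_nonempty(*parts):
    """Join the non-empty sentence fields with a single space."""
    return " ".join(p for p in parts if p)


def read_fusion_pairs(handle):
    """Yield (source, target) pairs from a tab-separated file-like object.

    source: the independent (unfused) sentences, joined with a space.
    target: the fused, coherent text.
    """
    # QUOTE_NONE keeps quotation marks inside sentences as literal characters.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        source = join_nonempty(
            row["incoherent_first_sentence"],
            row["incoherent_second_sentence"],
        )
        target = join_nonempty(
            row["coherent_first_sentence"],
            row["coherent_second_sentence"],
        )
        yield source, target


pairs = list(read_fusion_pairs(io.StringIO(SAMPLE_TSV)))
for source, target in pairs:
    print(f"{source}  ->  {target}")
```

To read a real file, replace `io.StringIO(SAMPLE_TSV)` with `open(path, newline="", encoding="utf-8")`. Because the function yields rows one at a time, the full ~60M examples never need to be held in memory at once.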

Fine-tune with DiscoFuse dataset

Metatext is a powerful no-code tool for training, tuning, and integrating custom NLP models.


Paper

Read the full original DiscoFuse paper.

Download PDF paper



Metatext helps you classify and extract information from text and documents using language models customized with your data and expertise.