The Chinese Science and Technology Literature Data Set (CSL) (CLUE Benchmark) Dataset

Created by Li Yudong et al. in 2020, the Chinese Science and Technology Literature Data Set (CSL), part of the CLUE Benchmark, is built from the abstracts and keywords of Chinese papers. The papers are selected from core journals in the Chinese social sciences and natural sciences. TF-IDF is used to generate fake keywords, which are mixed with each paper's real keywords to construct abstract-keyword pairs. The task is to judge, based on the abstract, whether all the keywords in a pair are real keywords of the paper. The data is in Chinese and is distributed in JSON format; the dataset size is not listed.
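The construction described above can be illustrated with a small sketch. This is not the CSL authors' pipeline: the page only says that TF-IDF is used to produce fake keywords, so the scoring formula, the rule of swapping the last real keyword for the top-scoring non-keyword term, the helper names `tfidf_scores` and `make_pair`, and the output field names `abstract`, `keywords`, and `label` are all assumptions made for illustration. English tokens are used for readability; real Chinese abstracts would need word segmentation first.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Return one {term: tf-idf score} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return scores


def make_pair(tokens, real_keywords, doc_scores, n_fake=0):
    """Build one abstract-keyword pair (hypothetical record layout).

    With n_fake > 0, the last n_fake real keywords are replaced by the
    highest-scoring terms that are not real keywords (assumed rule).
    """
    candidates = sorted(
        (t for t in doc_scores if t not in real_keywords),
        key=lambda t: (-doc_scores[t], t),  # score descending, then name
    )
    fakes = candidates[:n_fake]
    kept = real_keywords[: max(0, len(real_keywords) - len(fakes))]
    return {
        "abstract": " ".join(tokens),
        "keywords": list(kept) + fakes,
        "label": 0 if fakes else 1,  # 1 = all keywords are real
    }


# Toy corpus standing in for segmented abstracts.
docs = [
    ["graph", "neural", "network", "node", "classification", "graph"],
    ["soil", "erosion", "rainfall", "model", "soil"],
    ["neural", "network", "image", "classification", "model"],
]
scores = tfidf_scores(docs)
real = ["graph", "classification"]

positive = make_pair(docs[0], real, scores[0], n_fake=0)
negative = make_pair(docs[0], real, scores[0], n_fake=1)
```

A classifier for the task would then take `abstract` and `keywords` as input and predict `label`.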

Dataset Sources

Here you can download the Chinese Science and Technology Literature Data Set (CSL) (CLUE Benchmark) dataset in JSON format.

Download The Chinese Science and Technology Literature Data Set (CSL) (CLUE Benchmark) dataset JSON files

Fine-tune with The Chinese Science and Technology Literature Data Set (CSL) (CLUE Benchmark) dataset

Metatext is a no-code tool for training, tuning, and integrating custom NLP models.

Paper

Read the full original Chinese Science and Technology Literature Data Set (CSL) (CLUE Benchmark) paper.

Download PDF paper


Metatext helps you to classify and extract information from text and documents with customized language models with your data and expertise.