Open Research Corpus
Text Corpora, English, Benchmark
The Open Research Corpus dataset is an English text corpus from Ammar et al. (2018) comprising over 39 million examples.
This dataset is used as an LLM benchmark.
About Open Research Corpus
The dataset contains over 39 million published research papers in computer science, neuroscience, and biomedicine.
Details
- Task: Text Corpora
- Language: English
- Format: JSON
- Rows / instances: 39M
- Creator: Ammar et al.
- Year: 2018
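
Because the corpus is distributed as JSON and holds roughly 39M rows, it is usually more practical to read it as a stream than to load it whole. The sketch below assumes a JSON Lines layout (one record per line), which this page does not confirm; the helper names `iter_records` and `count_records` and the `id` and `title` fields in the sample are hypothetical and used only for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed record per non-empty line (JSON Lines layout assumed)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def count_records(lines: Iterable[str]) -> int:
    """Count records while holding only one in memory at a time."""
    return sum(1 for _ in iter_records(lines))


# Hypothetical sample; real records and their field names may differ.
sample = [
    '{"id": "a1", "title": "Example paper one"}',
    '',
    '{"id": "b2", "title": "Example paper two"}',
]
records = list(iter_records(sample))
```

With a real download, the same generator can be fed an open file handle (for example `open(path, encoding="utf-8")`), since iterating over a text file yields one line at a time.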