ACL Anthology Reference Corpus (ACL ARC)
ACL Anthology Reference Corpus (ACL ARC) is an English text corpus benchmark dataset from Lahiri et al. with 10,921 records in text format.
This dataset is used as an LLM benchmark.
About ACL Anthology Reference Corpus (ACL ARC)
The dataset contains 10,921 articles from the February 2007 snapshot of the ACL Anthology. Text and metadata were extracted for each article. The metadata consists of BibTeX records derived either from the header of each paper or from metadata taken from the Anthology website.
Details
- Task: Text Corpora
- Language: English
- Format: Text
- Rows / instances: 10,921
- Creator: Lahiri et al.
- Year: 2014