Question 1

What is the SFU Opinion and Comments Corpus (SOCC) dataset?

Accepted Answer

The dataset contains 10,339 opinion articles (editorials, columns, and op-eds) together with their 663,173 comments from 303,665 comment threads. The articles come from The Globe and Mail, the main English-language Canadian daily, and cover January 2012 to December 2016.…
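
To get a feel for the scale, the counts quoted above can be turned into rough per-article and per-thread averages. This is only a back-of-the-envelope sketch: the three totals are taken from the answer, while the constant and function names are made up for illustration and are not part of the SOCC release.

```python
# Totals as quoted in the answer above.
ARTICLES = 10_339
COMMENTS = 663_173
THREADS = 303_665


def average(total: int, count: int) -> float:
    """Return total / count, rounded to two decimal places."""
    return round(total / count, 2)


# Hypothetical summary names, used only for this illustration.
comments_per_article = average(COMMENTS, ARTICLES)
comments_per_thread = average(COMMENTS, THREADS)
threads_per_article = average(THREADS, ARTICLES)

if __name__ == "__main__":
    print("comments per article:", comments_per_article)
    print("comments per thread:", comments_per_thread)
    print("threads per article:", threads_per_article)
```

These are corpus-wide means derived from the totals alone, so they say nothing about how comments are distributed across individual articles or threads.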

Question 2

Is SFU Opinion and Comments Corpus (SOCC) a benchmark?

Accepted Answer

SFU Opinion and Comments Corpus (SOCC) is a dataset for training or evaluation; it isn't tracked as a standard LLM benchmark in our catalog.

Question 3

Where can I download SFU Opinion and Comments Corpus (SOCC)?

Accepted Answer

SFU Opinion and Comments Corpus (SOCC) is available at its source: https://github.com/sfu-discourse-lab/SOCC.
