SFU Opinion and Comments Corpus (SOCC) Dataset
Created by Kolhatkar et al. in 2018, the SFU Opinion and Comments Corpus (SOCC) Dataset contains 10,339 opinion articles (editorials, columns, and op-eds) together with their 663,173 comments from 303,665 comment threads, drawn from the main Canadian English-language daily, The Globe and Mail, between January 2012 and December 2016. In addition, an annotated subset of 1,043 comments, written in response to 10 different articles, is labeled for toxicity, negation and its scope, and appraisal. These articles cover a variety of subjects: technology, immigration, terrorism, politics, budget, social issues, religion, property, and refugees. The corpus is in English, and its 663,173 comments are provided in CSV file format.
Dataset Sources
Here you can download the SFU Opinion and Comments Corpus (SOCC) dataset in CSV format.
Download SFU Opinion and Comments Corpus (SOCC) dataset CSV files
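Once the CSV files are downloaded, a quick sanity check is to count comment rows and distinct threads and compare them with the figures quoted above. The following is a minimal sketch using only Python's standard library; the column names `thread_id` and `comment_text` and the helper `summarize_comments` are hypothetical, chosen for illustration, and should be replaced with the headers found in the actual files.

```python
import csv
import io
from typing import TextIO


def summarize_comments(handle: TextIO, thread_column: str = "thread_id") -> dict:
    """Count comment rows and distinct threads in a comments CSV.

    `thread_column` is an assumed header name, not one confirmed
    by the dataset description; pass the real one when known.
    """
    reader = csv.DictReader(handle)
    if reader.fieldnames is None or thread_column not in reader.fieldnames:
        raise KeyError(f"column {thread_column!r} not found in CSV header")

    comment_count = 0
    thread_ids = set()
    for row in reader:
        comment_count += 1
        thread_ids.add(row[thread_column])
    return {"comments": comment_count, "threads": len(thread_ids)}


# Tiny in-memory sample standing in for a downloaded file (made-up rows).
sample = io.StringIO(
    "thread_id,comment_text\n"
    "t1,First comment\n"
    "t1,A reply in the same thread\n"
    "t2,A comment in another thread\n"
)
summary = summarize_comments(sample)
print(summary)  # {'comments': 3, 'threads': 2}
```

For a file on disk, the same function can be given a handle opened with `open(path, newline="", encoding="utf-8")`, which is how the `csv` module expects files to be opened. Streaming rows this way avoids loading all comments into memory at once.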
Fine-tune with SFU Opinion and Comments Corpus (SOCC) dataset
Metatext is a no-code tool for training, tuning, and integrating custom NLP models.
Paper
Read the full original SFU Opinion and Comments Corpus (SOCC) paper.