Hate Speech Dataset from a White Supremacy Forum (1809.04444v1)

Published 12 Sep 2018 in cs.CL

Abstract: Hate speech is commonly defined as any communication that disparages a target group of people based on some characteristic such as race, colour, ethnicity, gender, sexual orientation, nationality, religion, or other characteristic. Due to the massive rise of user-generated web content on social media, the amount of hate speech is also steadily increasing. Over the past years, interest in online hate speech detection and, particularly, the automation of this task has continuously grown, along with the societal impact of the phenomenon. This paper describes a hate speech dataset composed of thousands of sentences manually labelled as containing hate speech or not. The sentences have been extracted from Stormfront, a white supremacist forum. A custom annotation tool has been developed to carry out the manual labelling task which, among other things, allows the annotators to choose whether to read the context of a sentence before labelling it. The paper also provides a thoughtful qualitative and quantitative study of the resulting dataset and several baseline experiments with different classification models. The dataset is publicly available.

Citations (402)

Summary

The paper introduces a hate speech dataset of 10,568 sentences from a white supremacy forum, each annotated individually at the sentence level.
It employs rigorous annotation guidelines and achieves satisfactory inter-annotator agreement with Cohen’s kappa values above 0.61.
Baseline experiments using SVM, CNN, and LSTM (with LSTM reaching 0.78 accuracy) underscore the dataset’s potential for enhancing hate speech detection.

Hate Speech Dataset from a White Supremacy Forum: An Analytical Overview

The paper offers a detailed study of hate speech through the development and analysis of a dataset sourced from Stormfront, a white supremacist internet forum. The dataset is distinctive in annotating hate speech at the sentence level rather than at the level of whole posts or documents, which gives a granular view of language use in hate-promoting contexts.

Dataset Creation and Characteristics

The dataset comprises 10,568 sentences, each manually labeled as conveying hate speech or not. To prepare them, the authors scraped forum content and segmented it into individual sentences. The annotation guidelines define hate speech through three premises: it must be a deliberate attack, directed towards a group, and motivated by identity-related aspects of that group.

An innovative aspect of the dataset is the ‘relation’ label, assigned manually to sentences that require contextual interpretation: statements that might not seem hateful in isolation but become so when read in their intended sequence. Also notable is that the annotation procedure takes the forum's structure into account, such as sub-forums and user threads, which allows deeper insight into the conversations the sentences come from.

Annotation Process and Reliability

Annotation reliability was assessed with inter-annotator agreement metrics. Cohen's kappa reached 0.614 and 0.627 on the initial annotation batches, a satisfactory level of agreement that is in line with benchmarks in related literature. The process was iterative, and agreement improved over time as annotators became more accustomed to the nuanced guidelines.
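For readers unfamiliar with the metric, the following is a minimal sketch of how Cohen's kappa is computed for two annotators, using the standard definition: observed agreement corrected for the agreement expected by chance. The function name, the label strings, and the toy annotations are illustrative assumptions and are not taken from the dataset or from the authors' tooling.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label lists")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability that both pick the same label if each
    # annotator labelled at random according to their own label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if p_e == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Toy annotations (hypothetical): 10 sentences, two annotators.
annotator_1 = ["noHate", "noHate", "hate", "noHate", "hate",
               "noHate", "noHate", "noHate", "hate", "noHate"]
annotator_2 = ["noHate", "noHate", "hate", "noHate", "hate",
               "hate", "noHate", "noHate", "noHate", "noHate"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))  # 0.524
```

In this toy case the annotators agree on 8 of 10 sentences (0.80), but because both label most sentences as non-hate, chance agreement is already 0.58, so kappa is noticeably lower than raw agreement.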

Baseline Experimental Validation

To assess the practical applicability of the dataset, the authors ran experiments with standard classification approaches: Support Vector Machines, Convolutional Neural Networks, and LSTMs. Even with these straightforward models the results were promising, which suggests that more complex models could improve on the baselines. Notably, the LSTM achieved the highest overall accuracy, 0.78, on context-independent cases.
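Because the figure above is an overall accuracy, it can help to look at accuracy per class as well, since a single number can hide weak performance on the rarer class. The sketch below shows one way to compute both; the function name, label strings, and toy predictions are assumptions made for illustration and do not reproduce the paper's evaluation code or results.

```python
def accuracy_report(gold, predicted):
    """Return (overall accuracy, {label: accuracy on items with that gold label})."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("need two equally long, non-empty label lists")
    total = {}    # number of items per gold label
    correct = {}  # number of correctly predicted items per gold label
    for g, p in zip(gold, predicted):
        total[g] = total.get(g, 0) + 1
        correct[g] = correct.get(g, 0) + (1 if g == p else 0)
    per_class = {label: correct[label] / total[label] for label in total}
    overall = sum(correct.values()) / len(gold)
    return overall, per_class


# Toy example (hypothetical): a classifier that always predicts "noHate".
gold = ["hate"] * 2 + ["noHate"] * 8
predicted = ["noHate"] * 10

overall, per_class = accuracy_report(gold, predicted)
print(overall, per_class)  # 0.8 {'hate': 0.0, 'noHate': 1.0}
```

Here the trivial classifier reaches 0.8 overall accuracy while detecting no hate speech at all, which is why per-class figures are a useful companion to the overall score.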

Implications and Future Directions

The practical utility of this dataset is considerable: its detailed, structured annotations offer fertile ground for developing more sophisticated hate speech detection systems. On the theoretical side, the work highlights the subjectivity inherent in defining and interpreting hate speech, a challenge that underscores the need for comprehensive contextual understanding in computational approaches.

Future work could incorporate more extralinguistic context and world knowledge to make classifiers more robust. Additionally, active learning could make future annotation efforts more efficient and help address the imbalance between hate and non-hate instances in the current dataset.
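As a point of comparison for the imbalance mentioned above, the sketch below shows random undersampling, a much simpler technique than active learning: every class is cut down to the size of the smallest one. It is not the authors' procedure; the function name, the `label_of` accessor, and the toy sentences are hypothetical.

```python
import random
from collections import Counter


def balanced_subset(examples, label_of, seed=0):
    """Randomly undersample so that every label has as many examples as the rarest one."""
    if not examples:
        return []
    rng = random.Random(seed)  # fixed seed so the subset is reproducible

    # Group examples by their label.
    by_label = {}
    for example in examples:
        by_label.setdefault(label_of(example), []).append(example)

    # Size of the rarest class decides how many examples each class keeps.
    k = min(len(group) for group in by_label.values())

    subset = []
    for label in sorted(by_label):
        subset.extend(rng.sample(by_label[label], k))
    rng.shuffle(subset)
    return subset


# Toy data (hypothetical): 3 "hate" sentences and 12 "noHate" sentences.
examples = [(f"sentence {i}", "hate") for i in range(3)]
examples += [(f"sentence {i}", "noHate") for i in range(3, 15)]

subset = balanced_subset(examples, label_of=lambda ex: ex[1])
counts = Counter(label for _, label in subset)
print(sorted(counts.items()))  # [('hate', 3), ('noHate', 3)]
```

Undersampling discards majority-class data, so it trades training set size for balance; whether that trade is worthwhile depends on the model and on how much data remains.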

Overall, this work provides a significant contribution to the field of computational social science and hate speech detection, offering a thorough, structured resource poised for methodological advancements in analyzing and understanding hate speech in digital environments.
