Detection and Resolution of Rumours in Social Media: A Survey (1704.00656v3)

Published 3 Apr 2017 in cs.CL, cs.HC, cs.IR, and cs.SI

Abstract: Despite the increasing use of social media platforms for information and news gathering, their unmoderated nature often leads to the emergence and spread of rumours, i.e. pieces of information that are unverified at the time of posting. At the same time, the openness of social media platforms provides opportunities to study how users share and discuss rumours, and to explore how natural language processing and data mining techniques may be used to find ways of determining their veracity. In this survey we introduce and discuss two types of rumours that circulate on social media: long-standing rumours that circulate for long periods of time, and newly-emerging rumours spawned during fast-paced events such as breaking news, where reports are released piecemeal and often with an unverified status in their early stages. We provide an overview of research into social media rumours with the ultimate goal of developing a rumour classification system that consists of four components: rumour detection, rumour tracking, rumour stance classification and rumour veracity classification. We delve into the approaches presented in the scientific literature for the development of each of these four components. We summarise the efforts and achievements so far towards the development of rumour classification systems and conclude with suggestions for avenues for future research in social media mining for detection and resolution of rumours.

Authors (5)

Arkaitz Zubiaga (88 papers)
Ahmet Aker (9 papers)
Kalina Bontcheva (64 papers)
Maria Liakata (59 papers)
Rob Procter (44 papers)

Citations (776)


Summary

The paper provides an extensive taxonomy of rumor detection methods, including rule-based, machine learning, and hybrid approaches.
It details feature extraction techniques from textual, user, and network data, emphasizing the importance of quality datasets and precise algorithms.
It highlights future directions such as real-time integration, cross-platform analysis, and enhanced feature engineering to combat misinformation.

An Analytical Overview of Information Dissemination Techniques in Rumor Detection

In the paper "Detection and Resolution of Rumours in Social Media: A Survey," the authors survey the methodologies used to identify and manage rumors on social networks and online platforms. The paper presents a taxonomy of rumor detection techniques organized along four dimensions: data sources, feature extraction methods, classification algorithms, and evaluation metrics.

Taxonomy of Rumor Detection Techniques

The paper categorizes rumor detection techniques into three principal categories based on their methodological approach: rule-based methods, machine learning-based methods, and hybrid methods.

Rule-Based Methods: These techniques rely on predefined rules and heuristics to identify potential rumors. The authors highlight that while rule-based methods are straightforward and easy to implement, their efficacy is significantly limited by the quality and comprehensiveness of the rules. The adaptability of these methods to new and evolving rumor patterns is also questioned.
Machine Learning-Based Methods: These methods use supervised, unsupervised, and semi-supervised learning algorithms to automatically distinguish rumors from legitimate information. The paper stresses the importance of large annotated datasets for training these models and discusses how textual, user-related, and network-specific features contribute to detection accuracy.
Hybrid Methods: By combining rule-based and machine learning techniques, hybrid methods aim to capitalize on the strengths of both approaches. The paper notes that hybrid models often outperform their pure counterparts in empirical studies.
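
To make the three categories concrete, the following is a minimal sketch of how a rule-based score and a learned score might be blended into a hybrid score. The regular expressions in `RULES`, the `toy_model` stand-in, and the `weight` parameter are all illustrative assumptions; none of them come from the paper.

```python
import re
from typing import Callable

# Hypothetical heuristics for illustration; the paper does not list specific rules.
RULES = [
    re.compile(r"\b(unconfirmed|unverified|rumou?r)\b", re.IGNORECASE),
    re.compile(r"\bis (this|that|it) (true|real)\b", re.IGNORECASE),
    re.compile(r"\?{2,}"),  # two or more consecutive question marks
]


def rule_score(text: str) -> float:
    """Fraction of heuristic rules that fire on the text (0.0 to 1.0)."""
    hits = sum(1 for rule in RULES if rule.search(text))
    return hits / len(RULES)


def toy_model(text: str) -> float:
    """Stand-in for a trained classifier's probability output (assumption)."""
    return 0.8 if "breaking" in text.lower() else 0.2


def hybrid_score(
    text: str,
    model_score: Callable[[str], float],
    weight: float = 0.5,
) -> float:
    """Blend the two scores; `weight` is the share given to the rules."""
    return weight * rule_score(text) + (1.0 - weight) * model_score(text)


if __name__ == "__main__":
    post = "Breaking: unconfirmed reports of a fire downtown"
    print(rule_score(post), toy_model(post), hybrid_score(post, toy_model))
```

A linear blend is only one way to combine the two signals; a real system might instead feed rule outputs into the classifier as additional features.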

Feature Extraction and Data Sources

The effectiveness of rumor detection systems is tightly coupled with the feature extraction process and the accessibility of quality data sources. The paper explores multiple feature types:

Textual Features: Including linguistic cues, sentiment analysis, and keyword patterns.
User Features: Such as user credibility, historical behavior, and social influence.
Network Features: Analyzing interaction patterns, propagation dynamics, and community detection.
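
As a rough illustration of the three feature families, the sketch below turns a single post into a flat feature dictionary. The `Post` fields, the `HEDGES` word list, and the specific features chosen are hypothetical; they are meant only to show the shape of such an extractor, not the features used in any surveyed system.

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical hedge words; a real lexicon would be far larger.
HEDGES = {"reportedly", "allegedly", "unconfirmed", "rumored"}


@dataclass
class Post:
    """Illustrative record for one social media post (all fields are assumptions)."""
    text: str
    author_followers: int
    author_verified: bool
    reply_count: int
    reshare_count: int


def extract_features(post: Post) -> Dict[str, float]:
    """Map a post to textual, user, and network features."""
    tokens = post.text.split()
    interactions = post.reply_count + post.reshare_count
    return {
        # Textual features: surface cues in the message itself.
        "n_tokens": float(len(tokens)),
        "n_question_marks": float(post.text.count("?")),
        "has_hedge_word": float(
            any(t.lower().strip(".,!?:;") in HEDGES for t in tokens)
        ),
        # User features: properties of the author.
        "author_followers": float(post.author_followers),
        "author_verified": float(post.author_verified),
        # Network features: how the post is being interacted with.
        "reply_ratio": post.reply_count / interactions if interactions else 0.0,
    }


if __name__ == "__main__":
    example = Post("Reportedly the bridge collapsed?", 120, False, 3, 9)
    print(extract_features(example))
```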

The authors further elaborate on the significance of diverse data sources, highlighting datasets curated from Twitter, Facebook, and Reddit, and their respective advantages and limitations in rumor detection tasks.

Classification Algorithms and Evaluation Metrics

The paper surveys a range of classification algorithms, from traditional machine learning models such as Support Vector Machines (SVMs) and decision trees to more recent deep learning architectures. It places particular emphasis on the applicability of Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs) in capturing the temporal and spatial properties of rumor propagation.
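
For readers unfamiliar with the traditional models mentioned above, here is a minimal sketch of the simplest possible decision tree: a depth-one tree (a "decision stump") fitted by exhaustive search over features and thresholds. The function names and the toy data are assumptions made for illustration; practical systems would use a full library implementation rather than this.

```python
from typing import Sequence, Tuple

# A stump is (feature index, threshold, label if value <= threshold, label otherwise).
Stump = Tuple[int, float, int, int]


def fit_stump(X: Sequence[Sequence[float]], y: Sequence[int]) -> Stump:
    """Pick the single-feature threshold split with the fewest training errors."""
    if not X or len(X) != len(y):
        raise ValueError("X and y must be non-empty and the same length")
    best: Stump = (0, X[0][0], 0, 1)
    best_errors = len(y) + 1
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            for left, right in ((0, 1), (1, 0)):
                errors = sum(
                    1
                    for row, label in zip(X, y)
                    if (left if row[j] <= threshold else right) != label
                )
                if errors < best_errors:
                    best_errors = errors
                    best = (j, threshold, left, right)
    return best


def predict_stump(stump: Stump, row: Sequence[float]) -> int:
    """Apply the learned split to one feature vector."""
    j, threshold, left, right = stump
    return left if row[j] <= threshold else right


if __name__ == "__main__":
    # Toy data: column 0 might be a question-mark count, column 1 is noise.
    X = [[0.0, 5.0], [1.0, 3.0], [2.0, 4.0], [3.0, 1.0]]
    y = [0, 0, 1, 1]  # 1 = rumor, 0 = non-rumor
    stump = fit_stump(X, y)
    print(stump, predict_stump(stump, [2.5, 0.0]))
```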

Evaluation metrics commonly used in rumor detection literature are discussed, with Precision, Recall, F1-Score, and Area Under the Curve (AUC) being identified as standard performance indicators. The paper calls attention to the need for standardized benchmarks to facilitate a more uniform comparison across different studies.
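
The metrics named above can be computed directly from labels and predictions. The sketch below uses the standard definitions (precision = TP / (TP + FP), recall = TP / (TP + FN), F1 as their harmonic mean) and computes AUC as the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as half. The function names and the sample data are illustrative assumptions.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    y_true = [1, 0, 1, 1, 0]
    y_pred = [1, 0, 0, 1, 1]
    scores = [0.9, 0.1, 0.4, 0.8, 0.7]
    print(precision_recall_f1(y_true, y_pred))
    print(roc_auc(y_true, scores))
```

The pairwise AUC is quadratic in the number of examples, which is fine for small evaluation sets; a rank-based formulation would scale better.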

Implications and Future Directions

The research sheds light on the practical implications of rumor detection in mitigating misinformation and ensuring the integrity of information in social networks. On a theoretical level, the paper contributes to the broader understanding of information diffusion processes and their anomalies.

Looking forward, the paper suggests multiple avenues for future work:

Integration with Real-Time Systems: Developing algorithms capable of operating in real-time scenarios to provide timely interventions.
Improved Annotated Datasets: Creating more robust, large-scale annotated datasets that encompass diverse rumor topics and propagation behaviors.
Cross-Platform Analysis: Investigating rumors that span multiple social media platforms to better understand cross-network dynamics.
Enhanced Feature Engineering: Exploring advanced feature engineering techniques, potentially aided by NLP advancements, to improve detection accuracy.

In conclusion, the paper offers a detailed and critical overview of current rumor detection techniques, providing valuable insights for both theoreticians and practitioners in the field of information dissemination and social network analysis. Its survey of methodologies, coupled with practical recommendations and future directions, makes it a substantial contribution to ongoing efforts aimed at combating misinformation.
