TweetCred: Real-Time Credibility Assessment of Content on Twitter (1405.5490v2)

Published 21 May 2014 in cs.CR, cs.SI, and physics.soc-ph

Abstract: During sudden onset crisis events, the presence of spam, rumors and fake content on Twitter reduces the value of information contained on its messages (or "tweets"). A possible solution to this problem is to use machine learning to automatically evaluate the credibility of a tweet, i.e. whether a person would deem the tweet believable or trustworthy. This has been often framed and studied as a supervised classification problem in an off-line (post-hoc) setting. In this paper, we present a semi-supervised ranking model for scoring tweets according to their credibility. This model is used in TweetCred, a real-time system that assigns a credibility score to tweets in a user's timeline. TweetCred, available as a browser plug-in, was installed and used by 1,127 Twitter users within a span of three months. During this period, the credibility score for about 5.4 million tweets was computed, allowing us to evaluate TweetCred in terms of response time, effectiveness and usability. To the best of our knowledge, this is the first research work to develop a real-time system for credibility on Twitter, and to evaluate it on a user base of this size.

Summary

The paper presents TweetCred, a system that employs a semi-supervised SVM-rank model with 45 features for real-time credibility assessment on Twitter.
The system computes 80% of credibility scores in under six seconds and scored about 5.4 million tweets during a three-month deployment.
User feedback indicates a conservative bias in the assessments, suggesting potential improvements through personalized trust models.

TweetCred: Real-Time Credibility Assessment of Content on Twitter

The paper "TweetCred: Real-Time Credibility Assessment of Content on Twitter" describes a system designed to address misinformation on Twitter. Real-time information matters most during crises, when spam, rumors and fake content reduce the value of tweets, so the authors propose TweetCred as a practical way to assess tweet credibility while users read their timelines.

TweetCred employs a semi-supervised ranking model that uses SVM-rank to evaluate the credibility of tweets based on a set of 45 features. These features are extracted from both the tweet's content and the author's metadata. Unlike previous methods that relied on post-hoc classification in offline settings, TweetCred operates in real time, computing a credibility score between 1 and 7 for each tweet without requiring historical data about the user or event. This lets the system give users rapid feedback on the tweets they encounter.
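
The pairwise idea behind a ranking SVM can be sketched as follows. This is a minimal illustration, not the authors' implementation: it is fully supervised, it uses plain subgradient descent rather than the SVM-rank solver, and the three features and the toy labels are hypothetical stand-ins for the paper's 45 features and crowdsourced annotations.

```python
import random

def pairwise_differences(features, labels):
    """Return x_i - x_j for every pair where item i is labeled more credible than item j."""
    pairs = []
    for xi, yi in zip(features, labels):
        for xj, yj in zip(features, labels):
            if yi > yj:
                pairs.append([a - b for a, b in zip(xi, xj)])
    return pairs

def train_rank_svm(features, labels, epochs=200, lr=0.1, reg=0.01, seed=0):
    """Learn a linear weight vector w so that w.(x_i - x_j) >= 1 for ordered pairs.

    Uses the hinge loss on pairwise differences with L2 shrinkage,
    optimized by simple stochastic subgradient steps.
    """
    rng = random.Random(seed)
    pairs = pairwise_differences(features, labels)
    w = [0.0] * len(features[0])
    for _ in range(epochs):
        rng.shuffle(pairs)
        for d in pairs:
            margin = sum(wk * dk for wk, dk in zip(w, d))
            # Regularization step: shrink the weights slightly.
            w = [wk * (1 - lr * reg) for wk in w]
            # Hinge-loss step: only pairs violating the margin push w.
            if margin < 1.0:
                w = [wk + lr * dk for wk, dk in zip(w, d)]
    return w

def score(w, x):
    """Linear ranking score; higher means ranked as more credible."""
    return sum(wk * xk for wk, xk in zip(w, x))

# Hypothetical features per tweet: [has_url, exclamation_rate, author_reputation]
X = [
    [1.0, 0.0, 0.9],
    [1.0, 0.2, 0.5],
    [0.0, 0.5, 0.4],
    [0.0, 1.0, 0.1],
]
# Hypothetical credibility labels (higher is more credible).
y = [4, 3, 2, 1]

w = train_rank_svm(X, y)
ranked = sorted(range(len(X)), key=lambda i: score(w, X[i]), reverse=True)
print(ranked)
```

The raw score here is only meaningful for ordering tweets; how TweetCred maps its model output onto the 1 to 7 scale is not described in this summary, so that step is left out.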

To train the model, the authors used data from six high-impact crisis events in 2013, annotated through crowdsourcing. The system is implemented as a browser extension that displays credibility scores directly within users' Twitter timelines. In deployment, 80% of the credibility scores were calculated and displayed within six seconds, which supports the system's suitability for real-time use.
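
A response-time figure such as "80% within six seconds" is the share of requests whose latency falls at or below a threshold. The sketch below shows that arithmetic; the function name and the sample latencies are made up for illustration and are not measurements from the paper.

```python
def fraction_within(latencies_s, threshold_s):
    """Share of requests answered within threshold_s seconds."""
    if not latencies_s:
        raise ValueError("need at least one latency measurement")
    return sum(1 for t in latencies_s if t <= threshold_s) / len(latencies_s)

# Illustrative (invented) end-to-end latencies in seconds.
sample_latencies = [1.2, 2.5, 3.1, 4.8, 5.9, 6.4, 2.2, 3.3, 9.0, 4.1]
share = fraction_within(sample_latencies, 6.0)
print(f"{share:.0%} of scores returned within 6 s")
```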

TweetCred was deployed at scale: 1,127 Twitter users installed and used it, and it computed credibility scores for approximately 5.4 million tweets. User feedback shows that perceived credibility diverged from the system's score for some tweets, with users often finding TweetCred's assessments more conservative than their own. This points to a potential bias in the model toward caution, especially when content seems unrelated to crisis events.

Methodologically, the paper compares several learning-to-rank algorithms and selects SVM-rank for its balance between ranking accuracy and computational efficiency. Ranking quality is measured with metrics such as NDCG, on which SVM-rank achieves results competitive with other techniques such as AdaRank and RankBoost.
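
For readers unfamiliar with NDCG (normalized discounted cumulative gain), the sketch below shows one common formulation, with gain 2^rel - 1 and a log2 position discount. Whether the paper uses exactly this variant or a particular cutoff k is an assumption here.

```python
import math

def dcg(relevances, k=None):
    """Discounted cumulative gain of relevance labels listed in ranked order."""
    rels = relevances if k is None else relevances[:k]
    return sum((2 ** r - 1) / math.log2(pos + 2) for pos, r in enumerate(rels))

def ndcg(relevances_in_predicted_order, k=None):
    """DCG of the predicted ordering divided by the DCG of the ideal ordering."""
    ideal = sorted(relevances_in_predicted_order, reverse=True)
    ideal_dcg = dcg(ideal, k)
    if ideal_dcg == 0:
        return 0.0  # no relevant items, so the ranking cannot be scored
    return dcg(relevances_in_predicted_order, k) / ideal_dcg

# Relevance labels of tweets in the order a model ranked them.
perfect = ndcg([3, 2, 1])
reversed_order = ndcg([1, 2, 3])
print(perfect, reversed_order)
```

A perfectly ordered list scores 1.0, and any ordering that places less credible tweets above more credible ones scores lower, which is why the metric suits comparisons between ranking algorithms.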

Looking forward, the paper identifies several areas for future work. These include incorporating personalized trust models to accommodate users' unique trust networks and refining the system's contextual understanding of tweets to better differentiate between factual information and opinion-based content. Such advancements would further enhance TweetCred's applicability and user satisfaction.

In conclusion, TweetCred contributes a scalable, practical tool for automated credibility assessment of fast-changing information on social media. Its findings and deployed system provide a foundation for further research on credibility analysis and machine learning applications in social media.
