DepecheMood: a Lexicon for Emotion Analysis from Crowd-Annotated News

Published 7 May 2014 in cs.CL and cs.CY | (1405.1605v1)

Abstract: While many lexica annotated with words polarity are available for sentiment analysis, very few tackle the harder task of emotion analysis and are usually quite limited in coverage. In this paper, we present a novel approach for extracting - in a totally automated way - a high-coverage and high-precision lexicon of roughly 37 thousand terms annotated with emotion scores, called DepecheMood. Our approach exploits in an original way 'crowd-sourced' affective annotation implicitly provided by readers of news articles from rappler.com. By providing new state-of-the-art performances in unsupervised settings for regression and classification tasks, even using a naïve approach, our experiments show the beneficial impact of harvesting social media data for affective lexicon building.


Authors (2)

Citations (187)


Summary

The paper presents a novel, crowd-sourced emotion lexicon (DepecheMood) derived from user interactions on rappler.com for nuanced emotion analysis.
It uses fully automated techniques to derive emotion scores for roughly 37,000 terms, and reports state-of-the-art performance among unsupervised approaches on both regression and classification tasks.
The study highlights the lexicon’s practical utility for digital content analysis, offering enhanced capabilities for buzz monitoring and crisis management.

An Expert Review of "DepecheMood: A Lexicon for Emotion Analysis from Crowd-Annotated News"

The paper "DepecheMood: A Lexicon for Emotion Analysis from Crowd-Annotated News" contributes to the field of emotion analysis by developing and sharing a large-scale emotion lexicon named DepecheMood. The authors, Jacopo Staiano and Marco Guerini, build a high-coverage, high-precision lexicon using fully automated techniques that assign emotion scores to approximately 37,000 terms. The lexicon is based on crowd-sourced affective annotations derived from user interactions on rappler.com, a social news network.

The authors point out a shortcoming of existing lexica: most are limited to sentiment analysis and lack the granularity needed for comprehensive emotion analysis. They stress the need for detailed emotional categories, arguing that simple positive-versus-negative sentiment annotations may not capture the nuances required for applications such as buzz monitoring and crisis management.

The authors build the lexicon from the crowd-sourced emotional annotations already available through the rappler.com Mood Meter. Through reader votes, this feature records the emotional responses elicited by news articles across eight emotional dimensions, which makes it a valuable source of large volumes of affective data. The resulting Document-by-Emotion Matrix ($M_{DE}$) forms the basis upon which DepecheMood is built.
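To make the role of the Document-by-Emotion Matrix concrete, here is a minimal sketch of one plausible way to turn per-document emotion votes into per-word emotion scores. The toy documents, the three-emotion label subset, the helper name `build_word_emotion`, and the exact weighting and normalization are all illustrative assumptions, not the authors' published pipeline.

```python
from collections import Counter

# Hypothetical toy data: each entry pairs a document's text with the share of
# reader votes per emotion (one row of a document-by-emotion matrix).
EMOTIONS = ["happy", "sad", "angry"]  # illustrative subset of labels
DOCS = [
    ("team wins final", [0.8, 0.1, 0.1]),
    ("storm destroys homes", [0.0, 0.7, 0.3]),
    ("team loses final", [0.1, 0.6, 0.3]),
]


def build_word_emotion(docs, n_emotions):
    """Accumulate frequency-weighted emotion votes for every word.

    Assumed scheme: weight each document's vote row by the word's
    normalized frequency in that document, sum over documents, then
    rescale each word's row so that its scores sum to 1.
    """
    raw = {}
    for text, votes in docs:
        counts = Counter(text.split())
        total = sum(counts.values())
        for word, count in counts.items():
            tf = count / total  # normalized term frequency in this document
            row = raw.setdefault(word, [0.0] * n_emotions)
            for j, vote in enumerate(votes):
                row[j] += tf * vote
    lexicon = {}
    for word, row in raw.items():
        row_sum = sum(row)
        lexicon[word] = [x / row_sum for x in row] if row_sum > 0 else row
    return lexicon


lexicon = build_word_emotion(DOCS, len(EMOTIONS))
```

In this toy setting, a word such as "team" that appears in both a happy and a sad article ends up with a mixed profile, while "storm" inherits the profile of the single article it appears in.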

The paper assesses the validity and utility of the DepecheMood lexicon through experiments on a public dataset from the SemEval 2007 task on emotion recognition in text. Notably, the unsupervised approaches using DepecheMood outperform existing systems in both regression and classification tasks. In particular, the reported Pearson correlation coefficients for fear, anger, surprise, joy, and sadness improve significantly on those of the systems that took part in the SemEval 2007 task.
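The regression evaluation described here can be sketched as follows: score each headline with the lexicon, then correlate the predictions with gold ratings. This is a hedged illustration only. The mini-lexicon, the headlines, the gold values, and the choice of averaging word scores are assumptions made for the example, not data or code from the paper.

```python
import math

# Hypothetical single-emotion lexicon (word -> fear score); values are made up.
FEAR_LEXICON = {"storm": 0.6, "attack": 0.9, "festival": 0.1, "wins": 0.05}


def headline_score(headline, lexicon):
    """Average the scores of the words found in the lexicon (0.0 if none)."""
    hits = [lexicon[w] for w in headline.lower().split() if w in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


headlines = [
    "Storm attack fears grow",
    "Festival opens downtown",
    "Team wins festival cup",
]
gold = [0.8, 0.1, 0.05]  # made-up gold fear ratings, one per headline

predicted = [headline_score(h, FEAR_LEXICON) for h in headlines]
r = pearson(predicted, gold)
```

Because no parameters are fitted to the gold ratings, a scheme of this kind counts as unsupervised: the only resource involved is the lexicon itself.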

A noteworthy feature of the DepecheMood lexicon is that it maintains high coverage of language use despite containing substantially fewer entries than larger sentiment lexica such as SentiWordNet. Its coverage of the headlines in the dataset was comparable to that of sentiment lexica with far more entries, which highlights the practical relevance of the resource.
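As a rough illustration of what a coverage figure measures, the sketch below computes the fraction of headline tokens that have an entry in a lexicon. Token-level coverage with whitespace tokenization is an assumed definition chosen for simplicity; the paper may define and compute coverage differently, and the sample data is invented.

```python
def coverage(headlines, vocabulary):
    """Fraction of whitespace-separated tokens that appear in the vocabulary."""
    tokens = [w for h in headlines for w in h.lower().split()]
    if not tokens:
        return 0.0
    return sum(1 for w in tokens if w in vocabulary) / len(tokens)


# Hypothetical headlines and lexicon vocabulary.
sample_headlines = ["Storm hits coast", "Team wins"]
sample_vocabulary = {"storm", "team", "wins"}

cov = coverage(sample_headlines, sample_vocabulary)
```

Here three of the five tokens are found, so a smaller lexicon can still score well on this measure if its entries match the words that actually occur in the target text.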

The authors note some limitations caused by the imperfect mapping between Rappler's emotion labels and those in the SemEval dataset, but the lexicon still demonstrates significant utility. Using normalized frequencies when building the word-emotion matrices proved beneficial, improving accuracy in tasks that required a simple binary classification of emotions.
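For the binary classification setting, continuous lexicon-based scores have to be turned into present/absent labels before they can be compared with gold labels. The sketch below shows one simple way to do that and to compute an F1 score. The threshold of 0.5, the scores, and the gold labels are assumptions for illustration and do not come from the paper.

```python
def binarize(scores, threshold=0.5):
    """Map continuous emotion scores to 0/1 labels (assumed threshold)."""
    return [1 if s >= threshold else 0 for s in scores]


def f1_score(predicted, gold):
    """F1 for the positive class, computed from binary label lists."""
    tp = sum(1 for p, g in zip(predicted, gold) if p == 1 and g == 1)
    fp = sum(1 for p, g in zip(predicted, gold) if p == 1 and g == 0)
    fn = sum(1 for p, g in zip(predicted, gold) if p == 0 and g == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


scores = [0.75, 0.10, 0.55, 0.30]  # made-up lexicon-based scores
gold_labels = [1, 0, 0, 1]         # made-up gold annotations

predicted_labels = binarize(scores)
f1 = f1_score(predicted_labels, gold_labels)
```

The choice of threshold directly trades precision against recall, which is one reason the scaling of the underlying word-emotion scores matters for classification results.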

The paper concludes by outlining directions for future research, including refining the lexicon with methods such as Singular Value Decomposition (SVD) and exploring the connection between mood perceptions and the virality of digital content.

Overall, "DepecheMood: A Lexicon for Emotion Analysis from Crowd-Annotated News" offers a robust framework for further exploration in emotion analysis, using crowd-sourced data to build richer emotion lexica. It paves the way for more nuanced interpretations and applications of emotional data analysis in digital content, and reinforces the value of leveraging user-interaction data at scale.
