
Emotion Intensities in Tweets (1708.03696v1)

Published 11 Aug 2017 in cs.CL

Abstract: This paper examines the task of detecting intensity of emotion from text. We create the first datasets of tweets annotated for anger, fear, joy, and sadness intensities. We use a technique called best-worst scaling (BWS) that improves annotation consistency and obtains reliable fine-grained scores. We show that emotion-word hashtags often impact emotion intensity, usually conveying a more intense emotion. Finally, we create a benchmark regression system and conduct experiments to determine: which features are useful for detecting emotion intensity, and, the extent to which two emotions are similar in terms of how they manifest in language.

Authors (2)
  1. Saif M. Mohammad (70 papers)
  2. Felipe Bravo-Marquez (8 papers)
Citations (205)

Summary

  • The paper presents the first annotated datasets for anger, fear, sadness, and joy intensities in tweets using Best-Worst Scaling.
  • It reveals that hashtags can significantly amplify the perceived emotion intensity in tweet content.
  • The AffectiveTweets regression system achieves strong Pearson correlations by integrating lexical, embedding, and n-gram features.

Overview of "Emotion Intensities in Tweets"

The paper "Emotion Intensities in Tweets," authored by Saif M. Mohammad and Felipe Bravo-Marquez, investigates the task of determining emotion intensity from textual data, specifically tweets. Unlike traditional emotion detection approaches that focus on categorical outcomes, this paper emphasizes a nuanced measurement of emotions through intensity scores, made possible by the creation of annotated datasets using Best-Worst Scaling (BWS).

Contributions

The paper presents several key contributions to the field of emotion analysis in natural language processing:

  1. Annotated Dataset for Emotion Intensity: This research introduces the first-ever datasets annotated for anger, fear, sadness, and joy intensities within tweets. The annotations were collected using the BWS method, which leverages relative comparisons among items to reduce bias and enhance consistency across annotators.
  2. Impact of Hashtags on Emotion Intensity: By comparing tweets with and without their emotion-word hashtags, the authors provide empirical evidence of how hashtags contribute to perceived emotion intensity. The paper finds that hashtags frequently intensify the emotion conveyed by a tweet, although the effect depends on how the hashtag interacts with the rest of the tweet.
  3. AffectiveTweets Regression System: The authors develop and evaluate a regression system using a variety of features to benchmark emotion intensity prediction. The system demonstrated significant correlations with human-annotated intensity scores, highlighting the importance of lexical resources, especially those tailored to social media vernacular, in predicting emotional nuances.
  4. Emotion Pair Similarities: The paper explores the linguistic closeness between emotions by training models for one emotion and applying them to another. This experiment sheds light on the asymmetrical nature of emotional expression and identifies pairs like fear and sadness that share linguistic features.
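To make the BWS annotation step in the first contribution concrete, the sketch below shows a common counting procedure for turning best-worst judgments into real-valued scores: for each item, the fraction of times it was chosen as most intense minus the fraction of times it was chosen as least intense. The function name `bws_scores`, the tuple layout of the input, and the linear rescaling to the 0-1 range are illustrative assumptions, not the authors' code.

```python
from collections import Counter

def bws_scores(annotations):
    """Score items from best-worst judgments (illustrative sketch).

    annotations: iterable of (items, best, worst), where `items` is the
    tuple of items shown together and `best`/`worst` are the items one
    annotator picked as most and least intense.
    """
    seen, best, worst = Counter(), Counter(), Counter()
    for items, b, w in annotations:
        seen.update(items)   # how often each item was shown
        best[b] += 1         # how often it was picked as most intense
        worst[w] += 1        # how often it was picked as least intense

    # Raw score in [-1, 1]: share of "best" picks minus share of "worst" picks.
    raw = {item: (best[item] - worst[item]) / seen[item] for item in seen}

    # Assumed post-processing: linear rescaling from [-1, 1] to [0, 1].
    return {item: (score + 1) / 2 for item, score in raw.items()}


# Toy example: two annotators judge the same 4-tuple of (hypothetical) tweet ids.
toy = [
    (("t1", "t2", "t3", "t4"), "t1", "t4"),
    (("t1", "t2", "t3", "t4"), "t1", "t3"),
]
scores = bws_scores(toy)
```

Because every judgment is a relative comparison within a small set, annotators never have to agree on an absolute scale, which is the property the summary credits for the improved consistency.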

Numerical Results

The paper reports high split-half reliability, particularly for fear, joy, and sadness intensities, with coefficients ranging from 0.84 to 0.88. Affect lexicon features alone yield Pearson correlations as high as 0.63 with the human-annotated scores. Combining word embeddings with affect lexicons improves performance further, attaining an average Pearson correlation of 0.66 across the emotions.
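Both figures above are correlation-based, so a small sketch may help show what is being measured. The code below computes a Pearson correlation and a simplified split-half reliability: the judgments for each item are randomly divided into two halves, each half is averaged into a score, and the two sets of scores are correlated. The representation of judgments as numbers (+1, 0, -1), the function names, and the number of trials are assumptions made for illustration; the paper's exact procedure may differ.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return float("nan")  # undefined when either sequence is constant
    return cov / (sx * sy)

def split_half_reliability(judgments, trials=100, seed=0):
    """Average correlation between scores from two random halves.

    judgments: dict mapping item -> list of numeric judgments (assumed
    encoding: +1 picked most intense, -1 least intense, 0 otherwise).
    Each item needs at least two judgments.
    """
    rng = random.Random(seed)
    items = sorted(judgments)
    total = 0.0
    for _ in range(trials):
        half_a, half_b = [], []
        for item in items:
            votes = list(judgments[item])
            rng.shuffle(votes)
            mid = len(votes) // 2
            half_a.append(sum(votes[:mid]) / mid)
            half_b.append(sum(votes[mid:]) / (len(votes) - mid))
        total += pearson(half_a, half_b)
    return total / trials
```

A value near 1 means that two independent groups of annotators would rank the tweets almost identically. The same `pearson` function is the kind of measure used to compare system predictions with the annotated scores.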

Methodological Insights

The paper's use of BWS for annotating sentences illustrates its scalability beyond individual word assessments, offering a promising methodological advancement for fine-grained sentiment analysis. Moreover, the integration of embeddings, affect lexicons, and n-gram features into the regression framework exemplifies a comprehensive feature engineering strategy tailored to the idiosyncrasies of tweet data.
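As a rough illustration of how the three feature families might be combined into one representation, the sketch below builds a feature dictionary from word n-gram counts, a summed affect-lexicon score, and an averaged word embedding. It is not the AffectiveTweets implementation: the function names, the whitespace tokenizer, the toy lexicon, and the toy embeddings are all hypothetical, and a real system would feed such features to a regression learner.

```python
from collections import Counter

def word_ngrams(tokens, n):
    """All contiguous word n-grams of length n, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def featurize(tweet, lexicon, embeddings, dim):
    """Build a sparse feature dict for one tweet (illustrative sketch).

    lexicon:    dict word -> emotion association score (hypothetical)
    embeddings: dict word -> list of `dim` floats (hypothetical)
    """
    tokens = tweet.lower().split()  # naive whitespace tokenizer (assumption)
    feats = {}

    # 1) Word n-gram counts (unigrams and bigrams).
    for n in (1, 2):
        for gram, count in Counter(word_ngrams(tokens, n)).items():
            feats[f"ngram={gram}"] = float(count)

    # 2) Affect lexicon: sum of association scores of the tweet's words.
    feats["lex_sum"] = sum(lexicon.get(tok, 0.0) for tok in tokens)

    # 3) Word embeddings: average the vectors of the words that have one.
    vecs = [embeddings[tok] for tok in tokens if tok in embeddings]
    for d in range(dim):
        feats[f"emb_{d}"] = sum(v[d] for v in vecs) / len(vecs) if vecs else 0.0

    return feats


# Toy usage with made-up resources.
toy_lexicon = {"furious": 0.9}
toy_embeddings = {"furious": [1.0, 0.0], "now": [0.0, 1.0]}
features = featurize("So furious right now", toy_lexicon, toy_embeddings, dim=2)
```

Keeping each family under its own key prefix makes it straightforward to drop one family at a time, which is the kind of ablation needed to ask which features are useful for detecting emotion intensity.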

Implications and Future Directions

The work provides a foundational step toward understanding and leveraging nuanced emotional expressions in online communication. Practically, systems that incorporate emotion intensity could better prioritize content in customer service platforms or analyze public sentiment with greater precision during crises. Theoretically, the findings invite further exploration of the cognitive and social dynamics of how emotion is linguistically encoded and perceived in digital microtexts.

For future research, expanding the scope to include more emotions or integrating multimodal data could enhance the robustness and applicability of such systems. Additionally, exploring domain adaptation techniques with the presented methods might provide improved results in cross-contextual settings.

Overall, the paper contributes significantly to the discourse on emotion analysis by providing both a comprehensive dataset and a detailed evaluation framework, opening new avenues for both practical applications and theoretical exploration.