Emotion Detection on TV Show Transcripts with Sequence-based Convolutional Neural Networks (1708.04299v1)

Published 14 Aug 2017 in cs.CL

Abstract: While there have been significant advances in detecting emotions from speech and image recognition, emotion detection on text is still under-explored and remained as an active research field. This paper introduces a corpus for text-based emotion detection on multiparty dialogue as well as deep neural models that outperform the existing approaches for document classification. We first present a new corpus that provides annotation of seven emotions on consecutive utterances in dialogues extracted from the show, Friends. We then suggest four types of sequence-based convolutional neural network models with attention that leverage the sequence information encapsulated in dialogue. Our best model shows the accuracies of 37.9% and 54% for fine- and coarse-grained emotions, respectively. Given the difficulty of this task, this is promising.

Summary

The paper introduces sequence-based CNN models with attention mechanisms that use the sequence of utterances in a dialogue and outperform existing document classification approaches on this task.
The methodology employs two primary architectures, SCNN_c and SCNN_v, each with an attention-based variant; the best model achieves 37.9% accuracy and a 26.9% macro-average F1-score on seven emotion categories.
The research provides a valuable annotated corpus from 'Friends' transcripts, paving the way for enhanced real-time emotion detection systems in dialogue-based applications.

Emotion Detection on TV Show Transcripts with Sequence-based Convolutional Neural Networks

The paper "Emotion Detection on TV Show Transcripts with Sequence-based Convolutional Neural Networks" presents a novel approach to emotion detection in text, specifically focusing on dialogues extracted from the TV show, "Friends." The authors introduce a new corpus and employ advanced sequence-based convolutional neural networks (SCNNs) to outperform existing document classification methods.

Introduction and Dataset

Emotion detection in text poses unique challenges, largely because annotated datasets are scarce. To address this, the authors developed a corpus by annotating consecutive utterances in dialogues with seven distinct emotions: sad, mad, scared, powerful, peaceful, joyful, and neutral. The dialogues from "Friends" are complex, featuring disfluency and humor, which contribute to the difficulty of the task. A notable effort in this research is the creation of a dataset with fine-grained annotations, offering substantial potential for further studies.

Methodology

The authors propose a series of SCNN models that leverage the sequential information intrinsic to dialogues. Unlike traditional CNNs, which classify each utterance in isolation, the SCNN models incorporate past utterances to improve emotion classification accuracy. They introduce two primary architectures, $\text{SCNN}_c$ and $\text{SCNN}_v$, each further enhanced by an attention mechanism ($\text{SCNN}_c^a$ and $\text{SCNN}_v^a$). The attention-based models learn how much weight to give the current utterance and each past utterance, so that more relevant earlier utterances can contribute more to the prediction.
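The idea of weighting the current and past utterances by relevance can be illustrated with a minimal sketch. It assumes each utterance has already been encoded as a fixed-size vector (for example by a CNN) and uses generic dot-product attention with a softmax; this is an illustration of the general technique, not the formulation used in the paper. The names `softmax` and `attend` and the toy vectors are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(current, past):
    """Weight the past utterance vectors and the current one by relevance.

    Assumption: relevance is the dot product with the current utterance
    vector. Returns the attention weights (past first, current last) and
    the weighted sum of the candidate vectors.
    """
    candidates = past + [current]
    scores = [sum(a * b for a, b in zip(current, c)) for c in candidates]
    weights = softmax(scores)
    dim = len(current)
    context = [
        sum(w * c[i] for w, c in zip(weights, candidates))
        for i in range(dim)
    ]
    return weights, context

# Toy 2-dimensional utterance vectors (hypothetical values).
past = [[0.1, 0.9], [0.8, 0.2]]
current = [1.0, 0.0]
weights, context = attend(current, past)
print(weights, context)
```

In this sketch the second past utterance is closer to the current one than the first, so it receives a larger weight. A trained model would learn the scoring function rather than use a raw dot product.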

Experimental Results

The paper reports promising results, with the $\text{SCNN}_c^a$ model achieving an accuracy of 37.9% for fine-grained classification and a macro-average F1-score of 26.9% on a seven-class emotion detection task. For a more generalized classification into three emotion categories (positive, negative, neutral), the model attains 54% accuracy. These results underscore the potential of SCNN architectures to effectively capture sequential dependencies in text dialogues.
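The two metrics above, and the effect of collapsing seven labels into three, can be made concrete with a small sketch. The grouping in `FINE_TO_COARSE` is an assumption about how the six non-neutral emotions map onto positive and negative, and the gold and predicted labels are invented toy data, not results from the paper.

```python
# Assumed grouping of the seven fine-grained emotions into three coarse ones.
FINE_TO_COARSE = {
    "joyful": "positive", "peaceful": "positive", "powerful": "positive",
    "sad": "negative", "mad": "negative", "scared": "negative",
    "neutral": "neutral",
}

def accuracy(gold, pred):
    """Fraction of utterances whose predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

def macro_f1(gold, pred, labels):
    """Unweighted mean of per-label F1 scores over the given labels."""
    f1s = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1s.append(2 * precision * recall / denom if denom else 0.0)
    return sum(f1s) / len(f1s)

# Toy labels for four utterances (hypothetical).
gold = ["joyful", "sad", "neutral", "mad"]
pred = ["peaceful", "sad", "neutral", "scared"]

fine_acc = accuracy(gold, pred)
fine_f1 = macro_f1(gold, pred, sorted(set(gold) | set(pred)))

coarse_gold = [FINE_TO_COARSE[g] for g in gold]
coarse_pred = [FINE_TO_COARSE[p] for p in pred]
coarse_acc = accuracy(coarse_gold, coarse_pred)

print(fine_acc, fine_f1, coarse_acc)
```

On this toy data, two fine-grained errors (joyful predicted as peaceful, mad predicted as scared) vanish after mapping to coarse labels, which illustrates why coarse-grained accuracy tends to be higher. Macro-average F1 weights every class equally, so it is usually lower than accuracy when rare classes are predicted poorly.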

Implications

The implications of this research are considerable both theoretically and practically. Theoretically, the use of SCNNs with attention mechanisms provides a robust methodological advancement for sequence-related text tasks, challenging the traditional supremacy of RNNs for such applications. Practically, this approach offers enhancements toward developing real-time text-based emotion detection systems, potentially impacting applications in human-computer interaction and sentiment analysis in conversational agents.

Future Directions

The paper suggests several avenues for future exploration, including extending the annotation of "Friends" transcripts to encompass additional seasons, thus expanding the dataset size. Furthermore, exploring alternative attention mechanism combinations could offer further performance improvements. Given that the field of emotion detection is still developing, especially in text contexts, there are rich opportunities for applying these findings to more diverse and real-world datasets.

In conclusion, this paper contributes significantly to the field of natural language processing by addressing the challenges of text-based emotion detection through innovative sequence-based CNN models with attention mechanisms, coupled with the provision of a novel dataset. As a foundation for future research, this work paves the way for the development of more effective emotion detection systems.
