Twitter Sentiment Analysis: Lexicon Method, Machine Learning Method and Their Combination

Published 3 Jul 2015 in cs.CL, cs.IR, cs.LG, stat.ME, and stat.ML | (1507.00955v3)

Abstract: This paper covers the two approaches for sentiment analysis: i) lexicon based method; ii) machine learning method. We describe several techniques to implement these approaches and discuss how they can be adopted for sentiment classification of Twitter messages. We present a comparative study of different lexicon combinations and show that enhancing sentiment lexicons with emoticons, abbreviations and social-media slang expressions increases the accuracy of lexicon-based classification for Twitter. We discuss the importance of feature generation and feature selection processes for machine learning sentiment classification. To quantify the performance of the main sentiment analysis methods over Twitter we run these algorithms on a benchmark Twitter dataset from the SemEval-2013 competition, task 2-B. The results show that machine learning method based on SVM and Naive Bayes classifiers outperforms the lexicon method. We present a new ensemble method that uses a lexicon based sentiment score as input feature for the machine learning approach. The combined method proved to produce more precise classifications. We also show that employing a cost-sensitive classifier for highly unbalanced datasets yields an improvement of sentiment classification performance up to 7%.

Authors (4)

Citations (506)

Summary

The paper introduces a hybrid approach that integrates lexicon-based sentiment scores with machine learning classifiers to enhance tweet analysis.
Lexicon enhancement with emoticons, slang, and abbreviations significantly improves classification accuracy over standard lexicons.
Machine learning classifiers (SVM and Naive Bayes) outperform the lexicon method, and a cost-sensitive classifier improves performance by up to 7% on highly unbalanced datasets.

Lexicon-Based Sentiment Analysis

The paper presents a detailed study of lexicon-based sentiment analysis tailored for Twitter data. In this approach, pre-defined sentiment lexicons serve as the foundation for computing semantic orientations of tweet text. The study investigates multiple lexicon configurations, including:

OL (Opinion Lexicon): A standard lexicon containing sentiment-labeled terms.
OL + EMO: An enhanced lexicon that incorporates emoticons, abbreviations, and social-media slang—elements that are critical given the informal nature of Twitter messages. Empirical results highlight that supplementing the opinion lexicon with these non-standard tokens leads to marked improvements in classification accuracy.
OL + EMO + AUTO: An extension of OL + EMO with an automatically generated lexicon. Although intuitively appealing, this configuration was observed to introduce ambiguity and complexity, resulting in diminished performance compared to the manually curated OL + EMO lexicon.

The analysis reinforces that adapting lexicons specifically for Twitter—by including emoticons and social media expressions—yields superior performance relative to generic sentiment lexicons.
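The effect of extending a lexicon with emoticons and slang can be illustrated with a minimal sketch. The word lists, the whitespace tokenizer, and the sum-of-polarities scoring rule below are all assumptions made for illustration; they are not the paper's actual lexicons or scoring formula.

```python
# Toy lexicons: hypothetical entries, not the word lists used in the paper.
OPINION_LEXICON = {"good": 1, "great": 1, "love": 1,
                   "bad": -1, "terrible": -1, "hate": -1}
EMO_LEXICON = {":)": 1, ":-)": 1, ":(": -1, ":-(": -1,
               "lol": 1, "gr8": 1, "wtf": -1}
OL_EMO = {**OPINION_LEXICON, **EMO_LEXICON}


def tokenize(tweet):
    """Split on whitespace; keep known emoticons/slang intact,
    strip surrounding punctuation from everything else."""
    tokens = []
    for raw in tweet.lower().split():
        if raw in EMO_LEXICON:
            tokens.append(raw)
        else:
            tokens.append(raw.strip(".,!?\"'"))
    return [t for t in tokens if t]


def lexicon_score(tweet, lexicon):
    """Assumed scoring rule: sum of the polarities of matched tokens."""
    return sum(lexicon.get(tok, 0) for tok in tokenize(tweet))


def classify(score):
    """Map a signed score to a three-way sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


tweet = "gr8 game tonight :)"
print(classify(lexicon_score(tweet, OPINION_LEXICON)))  # neutral
print(classify(lexicon_score(tweet, OL_EMO)))           # positive
```

With the plain opinion lexicon the example tweet has no matching terms and falls back to neutral, while the extended lexicon picks up both the slang term and the emoticon.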

Machine Learning Approaches

In contrast to lexicon-based methods, the paper details machine learning frameworks for sentiment classification operating on Twitter data. Specifically, the study employs:

Feature Generation and Selection: Emphasis is placed on deriving robust features using n-grams, term frequencies, and positional information. The paper underscores the importance of careful feature engineering; notably, features derived from the lexicon-based sentiment score ranked highly according to information gain metrics.
Classification Algorithms: The investigation integrates Support Vector Machines (SVM) and Naive Bayes classifiers. Both models are trained on labeled datasets sourced from the SemEval-2013 Task 2-B benchmark. The empirical evaluation demonstrates that machine learning classifiers generally outperform pure lexicon-based approaches.
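Ranking features by information gain, as mentioned above, can be sketched for discrete features as follows. This is a generic textbook formulation, not the paper's code; the function names and the toy data are hypothetical, and a continuous feature such as a lexicon score would first need to be discretized (for example, by its sign).

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    conditional = 0.0
    for value in set(feature_values):
        subset = [l for f, l in zip(feature_values, labels) if f == value]
        conditional += len(subset) / total * entropy(subset)
    return entropy(labels) - conditional


# Toy data: four tweets, two candidate binary features.
labels = ["pos", "pos", "neg", "neg"]
features = {
    "lexicon_score_positive": [1, 1, 0, 0],  # separates the classes perfectly
    "contains_url": [1, 0, 1, 0],            # carries no class information
}
ranking = sorted(features,
                 key=lambda name: information_gain(features[name], labels),
                 reverse=True)
print(ranking)
```

Features at the top of such a ranking are the ones a feature selection step would keep.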

Moreover, the study contrasts standard classification techniques with cost-sensitive classifiers. Because the sentiment classes in the Twitter data are highly unbalanced, adjusting misclassification costs so that errors on under-represented classes are penalized more heavily improved classification performance by up to 7%.
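One common way to set misclassification costs is to weight each class inversely to its frequency. The sketch below assumes that heuristic, n_samples / (n_classes * class_count); the paper's actual cost matrix is not given in this summary, and the class counts are made up for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count),
    so that rarer classes receive larger weights."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def weighted_error(y_true, y_pred, weights):
    """Error rate in which each mistake costs the weight of its true class."""
    total = sum(weights[t] for t in y_true)
    wrong = sum(weights[t] for t, p in zip(y_true, y_pred) if t != p)
    return wrong / total


# Hypothetical imbalanced label distribution.
y_true = ["neutral"] * 6 + ["positive"] * 3 + ["negative"] * 1
weights = balanced_class_weights(y_true)

# A degenerate classifier that always predicts the majority class.
y_pred = ["neutral"] * len(y_true)
plain_error = sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_error)                               # 0.4
print(weighted_error(y_true, y_pred, weights))   # about 0.667
```

Under the weighted measure, ignoring the two minority classes costs two thirds of the total rather than 40%, which is the pressure a cost-sensitive learner applies during training.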

Combined Method: An Ensemble Approach

A notable contribution of the paper is an ensemble method that combines the lexicon and machine learning approaches: the lexicon-based sentiment score is added directly to the feature vector used by the machine learning classifiers. The strategy is simple to implement:

The lexicon-based sentiment score, computed via the enhanced OL + EMO lexicon, serves as a high-information feature.
When included with other engineered textual features, this additional signal improves the discriminative capacity of classifiers such as SVM and Naive Bayes.
Experimental results show that the combined method produces more precise classifications than either approach used alone.

This integration corroborates that even with advanced feature engineering for machine learning, simple lexicon-derived statistics provide substantial value in the high-noise, informal environment of Twitter.
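The combination described above amounts to appending one extra column to the feature vector. A minimal sketch, assuming unigram counts as the textual features and a sum-of-polarities lexicon score (the toy lexicon, the tokenizer, and the function names are hypothetical, not the paper's):

```python
# Toy lexicon with hypothetical entries.
LEXICON = {"good": 1, "love": 1, "bad": -1, ":(": -1, ":)": 1}


def build_vocabulary(tweets):
    """Map each distinct whitespace token to a column index."""
    tokens = sorted({tok for tweet in tweets for tok in tweet.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}


def featurize(tweet, vocab, lexicon):
    """Bag-of-words counts followed by the lexicon score as the last entry."""
    tokens = tweet.lower().split()
    vector = [0] * len(vocab)
    for tok in tokens:
        if tok in vocab:
            vector[vocab[tok]] += 1
    score = sum(lexicon.get(tok, 0) for tok in tokens)
    return vector + [score]


train = ["good game", "bad game :("]
vocab = build_vocabulary(train)          # columns: ":(", "bad", "game", "good"
X = [featurize(t, vocab, LEXICON) for t in train]
print(X)  # [[0, 0, 1, 1, 1], [1, 1, 1, 0, -2]]
```

The resulting rows can be passed to any vector-based classifier. Some Naive Bayes variants expect non-negative inputs, so a signed score like this one may need to be shifted or binned first.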

Experimental Setup and Performance Evaluation

The empirical validation employs the benchmark Twitter dataset from the SemEval-2013 Task 2-B competition. Key aspects of the experimental setup include:

Utilization of various lexicon configurations to benchmark lexicon-based methods.
Application of SVM and Naive Bayes classifiers using standard and cost-sensitive training paradigms.
Extensive feature selection procedures to ascertain the relative contribution of each feature, with the lexicon-induced score consistently emerging as a top-ranked feature.

Quantitatively, the results highlight that:

The machine learning methods based on SVM and Naive Bayes surpass the standalone lexicon-based approach.
Incorporating the lexicon-based sentiment score as an additional feature in the classifier further boosts performance.
The deployment of a cost-sensitive classifier leads to performance improvements of up to 7%, particularly addressing the challenges posed by imbalanced datasets.

Practical Implications and Implementation Considerations

For practitioners aiming to deploy similar systems for Twitter sentiment analysis, the following implementation details are critical:

Lexicon Adaptation: Curating sentiment lexicons that explicitly include emoticons, abbreviations, and colloquialisms prevalent in social media significantly enhances performance. Regular updates and domain-specific customization are recommended.
Feature Engineering: Combining traditional bag-of-words features with lexicon-derived sentiment scores can yield superior results. Automated feature selection techniques like information gain analysis are useful for optimizing feature sets.
Classifier Selection: SVM and Naive Bayes are solid baseline models, and cost-sensitive learning is worth applying whenever classes are imbalanced. This is especially relevant in commercial environments where sentiment classes may be unevenly represented.
Ensemble Strategies: The integration of lexicon-based scores as features in machine learning ensembles is both computationally efficient and effective. This hybrid method demonstrates that leveraging complementary strengths from different approaches can lead to more precise classifications.

From a deployment perspective, computational considerations include the necessity for scalable feature extraction pipelines and sufficient resources to retrain models periodically as language usage on social media evolves. The cost-sensitive methods, while slightly more demanding in terms of parameter tuning, offer tangible benefits in imbalanced classification scenarios.

In summary, the research provides a comprehensive framework combining lexicon-based and machine learning methods that is both technically rigorous and practically effective for Twitter sentiment analysis. The insights on lexicon enhancement, feature integration, and cost-sensitive classification offer a robust guide for practitioners aiming to achieve high-precision sentiment classification in noisy, real-world social media environments.
