Mental Illness Classification on Social Media Texts using Deep Learning and Transfer Learning (2207.01012v1)

Published 3 Jul 2022 in cs.LG, cs.CL, and cs.CY

Abstract: Given the current social distance restrictions across the world, most individuals now use social media as their major medium of communication. Millions of people suffering from mental diseases have been isolated due to this, and they are unable to get help in person. They have become more reliant on online venues to express themselves and seek advice on dealing with their mental disorders. According to the World Health Organization (WHO), approximately 450 million people are affected. Mental illnesses, such as depression, anxiety, etc., are immensely common and have affected individuals' physical health. Recently AI methods have been presented to help mental health providers, including psychiatrists and psychologists, in decision making based on patients' authentic information (e.g., medical records, behavioral data, social media utilization, etc.). AI innovations have demonstrated predominant execution in numerous real-world applications broadening from computer vision to healthcare. This study analyzes unstructured user data on the Reddit platform and classifies five common mental illnesses: depression, anxiety, bipolar disorder, ADHD, and PTSD. We trained traditional machine learning, deep learning, and transfer learning multi-class models to detect mental disorders of individuals. This effort will benefit the public health system by automating the detection process and informing appropriate authorities about people who require emergency assistance.

Citations (16)

Summary

  • The paper demonstrates advanced use of deep and transfer learning to classify mental health conditions from social media texts.
  • It shows that transformer models, especially RoBERTa, outperform traditional and deep learning techniques with an accuracy of 0.83.
  • Results highlight potential for automated mental health screening and suggest a shift toward multi-label classification frameworks.

Mental Illness Classification on Social Media Texts using Deep Learning and Transfer Learning

The paper "Mental Illness Classification on Social Media Texts using Deep Learning and Transfer Learning" (2207.01012) presents a study on classifying mental illnesses through text analysis of social media posts, specifically posts on Reddit. The authors explore machine learning, deep learning, and transfer learning models for detecting five common mental disorders: depression, anxiety, bipolar disorder, ADHD, and PTSD. The classification approach aims to support public health systems by automatically identifying individuals in need of assistance.

Introduction to Mental Health Detection Approaches

The paper begins by addressing the critical importance of detecting mental health issues amid the increased online presence due to social isolation measures, such as those observed during the COVID-19 pandemic. Traditionally, mental illness diagnoses have depended heavily on self-reported symptoms rather than laboratory tests, which opens the door to utilizing AI for enhanced analysis.

Recent AI methods have been applied to large collections of unstructured text from social media platforms, giving researchers a way to study mental health conditions at scale. The paper favors Reddit posts for their rich, expressive nature, in contrast to the Twitter data that was popular in earlier work.

Problem Description and Dataset Analysis

The multi-class classification problem in this paper involves assigning each Reddit post to one of five mental illness classes or to a "none" category indicating no illness. The dataset comprises 16,930 posts, already pre-processed for analysis, with statistics detailed in Figure 1.

Figure 1: Mental Illness Dataset Statistics.

The authors apply further text preprocessing to prepare the model inputs, normalizing the text and removing non-essential components.
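
The summary does not list the exact preprocessing steps, so the following is only a minimal sketch of what normalization and removal of non-essential components might look like. The function name `normalize_post` and the specific rules (lowercasing, dropping URLs and punctuation, collapsing whitespace) are assumptions chosen for illustration, not the authors' pipeline.

```python
import re

# Assumed cleaning rules for illustration; the paper's actual steps may differ.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
NON_ALNUM_RE = re.compile(r"[^a-z0-9\s']")
WHITESPACE_RE = re.compile(r"\s+")


def normalize_post(text: str) -> str:
    """Lowercase a post, strip URLs and punctuation, and collapse whitespace."""
    text = text.lower()
    text = URL_RE.sub(" ", text)        # remove links
    text = NON_ALNUM_RE.sub(" ", text)  # keep letters, digits, apostrophes
    return WHITESPACE_RE.sub(" ", text).strip()


if __name__ == "__main__":
    print(normalize_post("I can't SLEEP!!  See https://example.com/x ..."))
```

Keeping apostrophes preserves contractions such as "can't", which may carry signal in first-person posts; whether to keep them is a design choice rather than something the paper specifies.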

Methodological Framework

Machine Learning Techniques

Traditional machine learning models such as Random Forest, Support Vector Machine, Naive Bayes, and Logistic Regression were applied using word n-grams with TF-IDF values. Despite yielding reasonable predictive capability, ML models require manual feature engineering, a time-intensive process that deep learning approaches can potentially mitigate.
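
To make the feature representation concrete, here is a small standard-library sketch of TF-IDF weights over word n-grams. The n-gram range (unigrams and bigrams) and the idf variant `ln(N / df) + 1` are assumptions; the summary does not state which settings the authors used, and in practice a library vectorizer would normally replace this code.

```python
import math
from collections import Counter


def word_ngrams(tokens, n_min=1, n_max=2):
    """Return all word n-grams with n_min <= n <= n_max as space-joined strings."""
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf_vectors(docs, n_min=1, n_max=2):
    """Map each document to a {ngram: weight} dict.

    Weight = raw term count * (ln(N / df) + 1), one common TF-IDF variant.
    """
    grams_per_doc = [word_ngrams(d.split(), n_min, n_max) for d in docs]
    df = Counter()
    for grams in grams_per_doc:
        df.update(set(grams))  # count each n-gram once per document
    n_docs = len(docs)
    vectors = []
    for grams in grams_per_doc:
        tf = Counter(grams)
        vectors.append(
            {g: c * (math.log(n_docs / df[g]) + 1.0) for g, c in tf.items()}
        )
    return vectors


if __name__ == "__main__":
    toy_docs = ["i feel anxious", "i feel fine"]  # toy posts, not from the dataset
    for vec in tfidf_vectors(toy_docs):
        print(vec)
```

N-grams that appear in every document (here "i", "feel", "i feel") receive the minimum weight, while n-grams unique to one post are weighted higher. The resulting sparse vectors are what a classifier such as Logistic Regression or an SVM would consume.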

Deep Learning Architectures

Deep learning models such as GRU, LSTM, and CNN form the backbone of the paper's more advanced text analysis. The Bi-LSTM model emerged as the top performer among the DL methods, which suggests that it captures sequential dependencies in the text effectively.

Transfer Learning Models

The paper's main contribution is its application of transformer-based models, namely BERT, XLNet, and RoBERTa. RoBERTa outperforms both the traditional and the deep learning models, reaching an accuracy of 0.83, which underscores the value of transfer learning for extracting nuanced contextual information from text.

Figure 2: RoBERTa confusion matrix.

The confusion matrix (Figure 2) shows RoBERTa's robust performance across the classes, especially in separating posts without mental illness from the rest with few false positives.

Results Discussion

The paper presents detailed numerical results that demonstrate RoBERTa's classification performance. Its precision and recall scores are considerably higher than those of the other models, and its ability to pick up complex patterns in mental health language prompts a discussion of the broader implications.

Despite the encouraging outcomes, certain classes, such as depression and anxiety, show lower F1-scores. This potentially stems from the brevity of the posts and their semantic overlap with other disorders, which makes straightforward classification difficult.
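
The metrics discussed above (accuracy, precision, recall, F1) can all be derived from a confusion matrix. The sketch below shows the arithmetic on a three-class toy matrix; the counts are invented for illustration and are not the paper's results, and the helper name `per_class_metrics` is hypothetical.

```python
def per_class_metrics(cm, labels):
    """Compute accuracy and per-class precision, recall, and F1.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j.
    """
    n = len(labels)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    report = {}
    for k, label in enumerate(labels):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # column sum minus diagonal
        fn = sum(cm[k]) - tp                       # row sum minus diagonal
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    accuracy = correct / total if total else 0.0
    return accuracy, report


if __name__ == "__main__":
    # Toy counts for illustration only; NOT taken from the paper.
    toy_labels = ["depression", "anxiety", "none"]
    toy_cm = [
        [6, 3, 1],
        [2, 7, 1],
        [0, 1, 9],
    ]
    acc, rep = per_class_metrics(toy_cm, toy_labels)
    print(f"accuracy = {acc:.3f}")
    for name, scores in rep.items():
        print(name, {m: round(v, 3) for m, v in scores.items()})
```

In the toy matrix, depression and anxiety are frequently confused with each other, which lowers their F1 relative to the "none" class even though overall accuracy stays moderate. This is the kind of pattern that semantic overlap between classes would produce.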

The authors argue for the potential evolution of classification tasks from multi-class to multi-label formats to better capture real-world complexities.
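
The difference between the two formats can be sketched at the output layer. A multi-class classifier applies a softmax and picks exactly one label, whereas a multi-label classifier scores each label independently (here with a sigmoid and a threshold) and may return several. The logits, the 0.5 threshold, and the function names below are hypothetical; the "none" class is left out for brevity, since in a multi-label setting an empty result could play that role.

```python
import math

LABELS = ["depression", "anxiety", "bipolar", "ADHD", "PTSD"]


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def sigmoid(x):
    """Logistic function mapping a score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_multiclass(logits, labels=LABELS):
    """Multi-class: exactly one label, the one with the highest probability."""
    probs = softmax(logits)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best]


def predict_multilabel(logits, labels=LABELS, threshold=0.5):
    """Multi-label: every label whose independent probability passes the threshold."""
    return [lab for lab, z in zip(labels, logits) if sigmoid(z) >= threshold]


if __name__ == "__main__":
    # Hypothetical scores for a post that reads as both depressed and anxious.
    toy_logits = [2.1, 1.8, -1.0, -2.3, -0.5]
    print(predict_multiclass(toy_logits))
    print(predict_multilabel(toy_logits))
```

For a post that signals both depression and anxiety, the multi-class head is forced to choose one, while the multi-label head can report both, which is closer to how these conditions co-occur in practice.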

Conclusion

In sum, this research underscores the strength of transfer learning models like RoBERTa for mental health classification, with significant implications for automated monitoring in the public health domain. It sets the stage for future work, pointing toward multi-label classification frameworks and ensemble modeling as ways to improve detection reliability.

This paper represents a stride toward leveraging AI for societal good, with sizable implications for clinical psychological assistance and public health strategies during crises of social isolation.
