An Assessment on Comprehending Mental Health through Large Language Models (2401.04592v2)
Abstract: Mental health challenges pose considerable global burdens on individuals and communities. Recent data indicate that more than 20% of adults may encounter at least one mental disorder in their lifetime. Although advances in large language models (LLMs) have enabled diverse applications, a significant research gap persists in understanding and enhancing their potential within the domain of mental health. Across these applications, an outstanding question is how well LLMs comprehend expressions of human mental health conditions in natural language. This study presents an initial evaluation of LLMs in addressing this gap. To this end, we compare the performance of Llama-2 and ChatGPT with classical machine learning and deep learning models. Our results on the DAIC-WOZ dataset show that transformer-based models such as BERT and XLNet outperform the LLMs.