A Framework for Evaluating Appropriateness, Trustworthiness, and Safety in Mental Wellness AI Chatbots (2407.11387v1)

Published 16 Jul 2024 in cs.HC

Abstract: LLM chatbots are susceptible to biases and hallucinations, but current evaluations of mental wellness technologies lack comprehensive case studies to evaluate their practical applications. Here, we address this gap by introducing the MHealth-EVAL framework, a new role-play based interactive evaluation method designed specifically for evaluating the appropriateness, trustworthiness, and safety of mental wellness chatbots. We also introduce Psyfy, a new chatbot leveraging LLMs to facilitate transdiagnostic Cognitive Behavioral Therapy (CBT). We demonstrate the MHealth-EVAL framework's utility through a comparative study of two versions of Psyfy against standard baseline chatbots. Our results showed that Psyfy chatbots outperformed the baseline chatbots in delivering appropriate responses, engaging users, and avoiding untrustworthy responses. However, both Psyfy and the baseline chatbots exhibited some limitations, such as providing predominantly US-centric resources. While Psyfy chatbots were able to identify most unsafe situations and avoid giving unsafe responses, they sometimes struggled to recognize subtle harmful intentions when prompted in role play scenarios. Our study demonstrates a practical application of the MHealth-EVAL framework and showcases Psyfy's utility in harnessing LLMs to enhance user engagement and provide flexible and appropriate responses aligned with an evidence-based CBT approach.
