TrustMH-Bench: A Comprehensive Benchmark for Evaluating the Trustworthiness of Large Language Models in Mental Health
Zixin Xiong, Ziteng Wang, Haotian Fan, Xinjie Zhang, Wenxuan Wang

TL;DR
This paper introduces TrustMH-Bench, a comprehensive framework for evaluating the trustworthiness of large language models in mental health, addressing domain-specific concerns often overlooked by general evaluation methods.
Contribution
We propose TrustMH-Bench, a holistic benchmark that systematically assesses mental health LLMs across eight trustworthiness dimensions, filling a critical evaluation gap.
Findings
Models underperform on trustworthiness metrics in mental health scenarios.
Even advanced models like GPT-5.1 show significant deficiencies.
Systematic improvements are necessary for deploying trustworthy mental health LLMs.
Abstract
While Large Language Models (LLMs) demonstrate significant potential in providing accessible mental health support, their practical deployment raises critical trustworthiness concerns due to the domain's high-stakes and safety-sensitive nature. Existing evaluation paradigms for general-purpose LLMs fail to capture mental health-specific requirements, highlighting an urgent need to prioritize and enhance their trustworthiness. To address this, we propose TrustMH-Bench, a holistic framework designed to systematically quantify the trustworthiness of mental health LLMs. By establishing a deep mapping from domain-specific norms to quantitative evaluation metrics, TrustMH-Bench evaluates models across eight core pillars: Reliability, Crisis Identification and Escalation, Safety, Fairness, Privacy, Robustness, Anti-sycophancy, and Ethics. We conduct extensive experiments across six…
Taxonomy
Topics: Mental Health via Writing · Digital Mental Health Interventions · Artificial Intelligence in Healthcare and Education
