MHDash: An Online Platform for Benchmarking Mental Health-Aware AI Assistants

Yihe Zhang; Cheyenne N Mohawk; Kaiying Han; Vijay Srinivas Tida; Manyu Li; Xiali Hei

arXiv:2602.00353·cs.AI·March 12, 2026

TL;DR

MHDash is an open-source platform that enables detailed, risk-aware evaluation of mental health AI assistants, revealing nuanced performance issues especially in high-risk and multi-turn scenarios.

Contribution

We introduce MHDash, a comprehensive platform for developing, evaluating, and auditing mental health AI systems with fine-grained, risk-sensitive analysis capabilities.

Findings

01

LLMs with similar overall accuracy diverge on high-risk cases.

02

Some models rank cases by severity consistently but fail at absolute risk detection.

03

Performance drops in multi-turn dialogues where risk signals are subtle.
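Findings 01 and 02 can be illustrated with a minimal sketch on toy data. Everything here is an assumption made for illustration: the integer risk levels, the model names, and the helper functions `accuracy`, `recall_per_level`, and `order_agreement` are hypothetical and are not taken from MHDash or its results.

```python
from collections import defaultdict
from itertools import combinations


def accuracy(y_true, y_pred):
    """Aggregate accuracy over all cases."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall_per_level(y_true, y_pred):
    """Recall computed separately for each true risk level."""
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return {level: hits[level] / totals[level] for level in totals}


def order_agreement(y_true, y_pred):
    """Fraction of pairs with different true levels whose predicted
    order matches the true order (predicted ties count as misses)."""
    pairs = [(i, j) for i, j in combinations(range(len(y_true)), 2)
             if y_true[i] != y_true[j]]
    agree = sum(
        (y_true[i] < y_true[j]) == (y_pred[i] < y_pred[j])
        and y_pred[i] != y_pred[j]
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical labels: 0 = low, 1 = medium, 2 = high risk.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
model_a = [0, 0, 0, 0, 0, 0, 1, 1, 0, 1]  # misses both high-risk cases
model_b = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]  # over-flags two low-risk cases

# Same aggregate accuracy, opposite behavior on the high-risk level.
print(accuracy(y_true, model_a), recall_per_level(y_true, model_a)[2])  # 0.8 0.0
print(accuracy(y_true, model_b), recall_per_level(y_true, model_b)[2])  # 0.8 1.0

# A model that shifts every prediction down one level keeps the ordering
# intact while getting every absolute level wrong.
sev_true = [1, 1, 2, 2, 3, 3]
sev_pred = [0, 0, 1, 1, 2, 2]
print(order_agreement(sev_true, sev_pred), accuracy(sev_true, sev_pred))  # 1.0 0.0
```

The toy models tie on aggregate accuracy, so only the per-level breakdown separates them, which is the kind of gap a risk-aware evaluation is meant to expose.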

Abstract

Large language models (LLMs) are increasingly applied in mental health support systems, where reliable recognition of high-risk states such as suicidal ideation and self-harm is safety-critical. However, existing evaluations primarily rely on aggregate performance metrics, which often obscure risk-specific failure modes and provide limited insight into model behavior in realistic, multi-turn interactions. We present MHDash, an open-source platform designed to support the development, evaluation, and auditing of AI systems for mental health applications. MHDash integrates data collection, structured annotation, multi-turn dialogue generation, and baseline evaluation into a unified pipeline. The platform supports annotations across multiple dimensions, including Concern Type, Risk Level, and Dialogue Intent, enabling fine-grained and risk-aware analysis. Our results reveal several key…
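The three annotation dimensions named in the abstract could be carried on each dialogue as a small record, which then makes slicing an evaluation set by any one dimension straightforward. This is a sketch under stated assumptions: the class `AnnotatedDialogue`, its field names, the helper `slice_by`, and all label values are hypothetical placeholders, not the MHDash schema or the MHDialog dataset format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotatedDialogue:
    """Hypothetical record for one multi-turn dialogue and its labels."""
    dialogue_id: str
    turns: tuple            # utterance strings, in order
    concern_type: str       # placeholder label values below
    risk_level: str
    dialogue_intent: str


def slice_by(records, field):
    """Group dialogue ids by the value of one annotation dimension."""
    groups = defaultdict(list)
    for record in records:
        groups[getattr(record, field)].append(record.dialogue_id)
    return dict(groups)


# Placeholder data; labels are invented for illustration only.
records = [
    AnnotatedDialogue("d1", ("turn 1", "turn 2"), "concern_x", "low", "intent_p"),
    AnnotatedDialogue("d2", ("turn 1",), "concern_y", "high", "intent_q"),
    AnnotatedDialogue("d3", ("turn 1", "turn 2"), "concern_x", "low", "intent_q"),
]

by_risk = slice_by(records, "risk_level")
print(by_risk)  # {'low': ['d1', 'd3'], 'high': ['d2']}
```

Keeping the labels as separate fields lets the same records be regrouped by concern type or dialogue intent without changing the evaluation code.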

Code & Models

Datasets

IkeZhang/MHDialog
dataset · 69 downloads

Taxonomy

Topics: Digital Mental Health Interventions · Mental Health via Writing · Machine Learning in Healthcare