From "Help" to Helpful: A Hierarchical Assessment of LLMs in Mental e-Health Applications

Philipp Steigerwald; Jens Albrecht

arXiv:2602.18443 · cs.HC · February 24, 2026

TL;DR

This study evaluates the performance of eleven large language models in generating and assessing email subject lines for German mental health counselling, highlighting trade-offs between proprietary and open-source models and addressing ethical concerns.

Contribution

It introduces a hierarchical assessment framework for LLM-generated counselling email subjects, combining categorization and ranking, and analyzes performance trade-offs and ethical considerations.

Findings

01. German fine-tuning improves model performance
02. Open-source models perform competitively with proprietary ones
03. Ethical issues like privacy and bias are critically addressed

Abstract

Psychosocial online counselling frequently encounters generic subject lines that impede efficient case prioritisation. This study evaluates eleven large language models generating six-word subject lines for German counselling emails through hierarchical assessment: first categorising outputs, then ranking within categories to enable manageable evaluation. Nine assessors (counselling professionals and AI systems) enable analysis via Krippendorff's $\alpha$, Spearman's $\rho$, Pearson's $r$ and Kendall's $\tau$. Results reveal performance trade-offs between proprietary services and privacy-preserving open-source alternatives, with German fine-tuning consistently improving performance. The study addresses critical ethical considerations for mental health AI deployment including privacy, bias and accountability.
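
The two-stage assessment described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the model names, the three-level category scale (1 = best) and the sample ratings are all hypothetical, and the helper functions `global_ranks`, `kendall_tau` and `spearman_rho` are written here for illustration only. The sketch flattens each assessor's (category, within-category rank) pairs into one overall ordering, then compares two assessors with Kendall's $\tau$ and Spearman's $\rho$, assuming complete rankings without ties.

```python
from itertools import combinations


def global_ranks(assessment):
    """Flatten a hierarchical assessment into overall positions 1..n.

    `assessment` maps an item to (category, rank_within_category);
    lower is better for both values. Assumes no ties inside a category.
    """
    ordered = sorted(assessment, key=lambda item: assessment[item])
    return {item: pos for pos, item in enumerate(ordered, start=1)}


def kendall_tau(ranks_a, ranks_b):
    """Kendall's tau-a for two complete rankings without ties."""
    items = sorted(ranks_a)
    concordant = discordant = 0
    for x, y in combinations(items, 2):
        sign = (ranks_a[x] - ranks_a[y]) * (ranks_b[x] - ranks_b[y])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(items) * (len(items) - 1) // 2
    return (concordant - discordant) / n_pairs


def spearman_rho(ranks_a, ranks_b):
    """Spearman's rho via the squared rank-difference formula (no ties)."""
    n = len(ranks_a)
    d_squared = sum((ranks_a[i] - ranks_b[i]) ** 2 for i in ranks_a)
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical ratings: item -> (category, rank within that category).
assessor_1 = {"model_a": (1, 1), "model_b": (1, 2), "model_c": (2, 1), "model_d": (3, 1)}
assessor_2 = {"model_a": (1, 2), "model_b": (1, 1), "model_c": (2, 1), "model_d": (2, 2)}

r1 = global_ranks(assessor_1)
r2 = global_ranks(assessor_2)

print(round(kendall_tau(r1, r2), 3))   # 0.667
print(round(spearman_rho(r1, r2), 3))  # 0.8
```

In this toy case the two assessors disagree only on the order of the top two items, so both coefficients stay high. With tied ranks or more than two assessors, a library routine and a multi-rater statistic such as Krippendorff's $\alpha$ would be the more appropriate tools.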

Taxonomy

Topics: Digital Mental Health Interventions · Mental Health via Writing · Artificial Intelligence in Healthcare and Education