Detecting Stealth Sycophancy in Mental-Health Dialogue with Dynamic Emotional Signature Graphs
Tianze Han, Beining Xu, Hanbo Zhang, Yongming Lu

TL;DR
This paper introduces Dynamic Emotional Signature Graphs (DESG), a model-agnostic method for evaluating therapeutic dialogue quality that represents clinical states and their trajectories. It outperforms ConcatANN, BERTScore, and TRACT.
Contribution
The paper proposes DESG, an approach that models the clinical states of a dialogue with asymmetric geometry to improve offline evaluation of mental-health dialogues.
Findings
DESG-Ensemble achieves 0.9353 macro-F1 on a diagnostic benchmark.
DESG outperforms ConcatANN, BERTScore, and TRACT on the reported evaluation metrics.
The clinical state manifold is key to discriminating therapeutic dialogue quality.
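For readers unfamiliar with the headline metric, macro-F1 is the unweighted mean of per-class F1 scores, so rare classes count as much as common ones. The sketch below is a minimal illustration, assuming the benchmark uses the three labels named in the abstract (harmful, productive, neutral); the function name `macro_f1` and the toy labels are hypothetical and are not taken from the paper.

```python
# Assumed label set, taken from the three classes named in the abstract.
LABELS = ("harmful", "productive", "neutral")


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data, invented for illustration only.
y_true = ["harmful", "harmful", "productive", "neutral"]
y_pred = ["harmful", "productive", "productive", "neutral"]
score = macro_f1(y_true, y_pred)
```

In this toy case one harmful response is misread as productive, which lowers the F1 of both classes while leaving the neutral class untouched.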
Abstract
As conversational AI therapists are increasingly used in psychological support settings, reliable offline evaluation of therapeutic response quality remains an open problem. This paper studies multi-domain support-dialogue evaluation without relying on large language models as final judges. We use a direct LLM judge as a baseline that reads raw dialogue text and predicts whether the target response is harmful, productive, or neutral. We find that direct LLM judges and symmetric text-similarity metrics are poorly aligned with therapeutic quality because the target label depends on clinical direction: whether the response moves the user state toward regulation or reframing, leaves it broadly unchanged, or reinforces deterioration through higher risk affect or cognitive-distortion mass. To address this issue, we propose Dynamic Emotional Signature Graphs (DESG), a model-agnostic evaluator…
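The point about clinical direction can be made concrete with a toy example. The sketch below is not the DESG method; it only assumes, for illustration, that a user state can be summarized as a pair (risk affect, cognitive-distortion mass), and shows that a symmetric similarity gives the same score whether the state improves or deteriorates, while a signed change does not. The names `cosine`, `directional_label`, the tolerance `tol`, and the state values are all hypothetical.

```python
import math


def cosine(u, v):
    """Symmetric similarity: cosine(u, v) == cosine(v, u)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def directional_label(before, after, tol=0.05):
    """Label a transition by the sign of the change in total risk.

    Assumption: each state is (risk_affect, distortion_mass) and total
    risk is their sum. This is an illustrative stand-in, not DESG.
    """
    delta = sum(after) - sum(before)
    if delta > tol:
        return "harmful"      # state moved toward deterioration
    if delta < -tol:
        return "productive"   # state moved toward regulation
    return "neutral"          # state broadly unchanged


# Invented states for illustration.
calm = (0.2, 0.1)
distressed = (0.7, 0.6)

same_score = cosine(calm, distressed) == cosine(distressed, calm)
worsening = directional_label(calm, distressed)
improving = directional_label(distressed, calm)
```

Here the two transitions are indistinguishable to the symmetric score, yet they receive opposite labels once direction is taken into account.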
