Stress-Testing Emotional Support Models: Moving from Homogeneous to Diverse Help Seekers
Chaewon Heo, Cheyon Jin, Yohan Jo

TL;DR
This paper introduces a controllable, psychologically informed seeker simulator for emotional support chatbots, enabling more diverse and realistic testing of chatbot performance across different user profiles.
Contribution
The authors develop a novel Mixture-of-Experts-based simulator that captures behavioral diversity and allows fine-grained control, improving evaluation fidelity for emotional support models.
Findings
The simulator achieves better profile adherence and behavioral diversity than existing seeker simulators.
Evaluation with seven supporter models reveals previously hidden performance issues.
The framework enhances stress-testing of emotional support chatbots.
Abstract
As emotional support chatbots have recently gained significant traction across both research and industry, a common evaluation strategy has emerged: use help-seeker simulators to interact with supporter chatbots. However, current simulators suffer from two critical limitations: (1) they fail to capture the behavioral diversity of real-world seekers, often portraying them as overly cooperative, and (2) they lack the controllability required to simulate specific seeker profiles. To address these challenges, we present a controllable seeker simulator driven by nine psychological and linguistic features that underpin seeker behavior. Using authentic Reddit conversations, we train our model via a Mixture-of-Experts (MoE) architecture, which effectively differentiates diverse seeker behaviors into specialized parameter subspaces, thereby enhancing fine-grained controllability. Our simulator…
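To make the idea of routing seeker behaviors into "specialized parameter subspaces" concrete, here is a minimal sketch of a soft-gated Mixture-of-Experts layer whose gate is conditioned on a nine-dimensional profile vector. It is an illustration under stated assumptions, not the authors' architecture: the class name `ProfileGatedMoE`, the expert count, the hidden size, the random weights, and the choice to gate on the profile vector alone are all hypothetical, and the abstract does not say how the nine features are encoded.

```python
import math
import random

NUM_FEATURES = 9  # the abstract mentions nine features; this numeric encoding is assumed
NUM_EXPERTS = 4   # hypothetical expert count, not taken from the paper
HIDDEN = 8        # hypothetical hidden size


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a row-major matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    """Small random weights standing in for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class ProfileGatedMoE:
    """Toy MoE layer: the seeker profile decides how the experts are mixed."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        # Gate maps the profile vector to one score per expert.
        self.gate = random_matrix(NUM_EXPERTS, NUM_FEATURES, rng)
        # Each expert is a separate linear map (its own parameter subspace).
        self.experts = [random_matrix(HIDDEN, HIDDEN, rng) for _ in range(NUM_EXPERTS)]

    def gate_weights(self, profile):
        """Mixing weights over experts; non-negative and summing to 1."""
        return softmax(matvec(self.gate, profile))

    def forward(self, hidden, profile):
        """Weighted sum of expert outputs for one hidden state."""
        weights = self.gate_weights(profile)
        outputs = [matvec(expert, hidden) for expert in self.experts]
        return [
            sum(w * out[i] for w, out in zip(weights, outputs))
            for i in range(HIDDEN)
        ]


# Two hypothetical profiles that differ in which feature is active.
profile_a = [1.0] + [0.0] * (NUM_FEATURES - 1)
profile_b = [0.0] * (NUM_FEATURES - 1) + [1.0]

layer = ProfileGatedMoE()
hidden_state = [0.1] * HIDDEN
output_a = layer.forward(hidden_state, profile_a)
output_b = layer.forward(hidden_state, profile_b)
```

Because the gate depends on the profile, the same hidden state is transformed by a different mixture of experts for each profile, which is one plausible way an MoE design could support fine-grained control. A trained model would learn the gate and expert weights from data rather than sampling them at random.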
