ES-MemEval: Benchmarking Conversational Agents on Personalized Long-Term Emotional Support
Tiantian Chen, Jiaqi Lu, Ying Shen, and Lin Zhang

TL;DR
This paper introduces ES-MemEval, a benchmark for evaluating the long-term memory capabilities of conversational agents in emotional support, along with EvoEmo, a dataset capturing implicit, evolving user disclosures. Together they reveal the strengths and limitations of current models.
Contribution
It presents a new benchmark and dataset specifically designed for assessing and advancing long-term memory in emotional support dialogue systems.
Findings
Explicit memory reduces hallucinations and improves personalization.
Retrieval-augmented models enhance factual accuracy but struggle with temporal dynamics.
Current models show limitations in handling evolving user states.
Abstract
Large Language Models (LLMs) have shown strong potential as conversational agents. Yet their effectiveness remains limited by deficiencies in robust long-term memory, particularly in complex, long-term web-based services such as online emotional support. Moreover, existing long-term dialogue benchmarks primarily focus on static and explicit fact retrieval, failing to evaluate agents in critical scenarios where user information is dispersed, implicit, and continuously evolving. To address this gap, we introduce ES-MemEval, a comprehensive benchmark that systematically evaluates five core memory capabilities (information extraction, temporal reasoning, conflict detection, abstention, and user modeling) in long-term emotional support settings, covering question answering, summarization, and dialogue generation tasks. To support the benchmark, we also propose EvoEmo, a multi-session dataset…
Taxonomy
Topics: Topic Modeling · Mental Health via Writing · Digital Mental Health Interventions
