MemSim: A Bayesian Simulator for Evaluating Memory of LLM-based Personal Assistants
Zeyu Zhang, Quanyu Dai, Luyu Chen, Zeren Jiang, Rui Li, Jieming Zhu, Xu Chen, Yi Xie, Zhenhua Dong, Ji-Rong Wen

TL;DR
MemSim is a Bayesian simulation framework that automatically generates reliable question-answer datasets for objectively evaluating the memory capabilities of LLM-based personal assistants.
Contribution
We introduce MemSim, a Bayesian simulator with a causal generation mechanism for automatic, scalable, and reliable memory evaluation of LLM-based agents.
Findings
MemSim effectively generates diverse evaluation datasets.
The MemDaily dataset enables benchmarking of memory mechanisms.
Our experiments demonstrate the simulator's reliability in assessing memory performance.
Abstract
LLM-based agents have been widely applied as personal assistants, capable of memorizing information from user messages and responding to personal queries. However, an objective and automatic evaluation of their memory capability is still lacking, largely because it is difficult to construct reliable questions and answers (QAs) from user messages. In this paper, we propose MemSim, a Bayesian simulator designed to automatically construct reliable QAs from generated user messages while preserving their diversity and scalability. Specifically, we introduce the Bayesian Relation Network (BRNet) and a causal generation mechanism to mitigate the impact of LLM hallucinations on factual information, facilitating the automatic creation of an evaluation dataset. Based on MemSim, we generate a dataset in the daily-life scenario, named MemDaily, and conduct extensive experiments to…
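The general idea of generating messages and QAs from the same sampled facts can be illustrated with a toy sketch. This is not the paper's BRNet or its causal generation mechanism: the two-node network, the attribute values, the fixed name, and the functions `sample_profile`, `make_message`, and `make_qa` are all hypothetical and chosen only for illustration. The sketch assumes that structured facts are sampled first (parents before children) and that the reference answer is copied from those facts rather than extracted from generated text, which is one way to keep a hallucinated message from corrupting the answer.

```python
import random

# Hypothetical toy network, for illustration only (not the BRNet from the paper).
# "workplace" depends on "occupation"; "occupation" has no parents.
OCCUPATIONS = ["teacher", "nurse", "designer"]
WORKPLACES = {
    "teacher": ["Hillside School", "Oak Academy"],
    "nurse": ["City Hospital", "Lakeside Clinic"],
    "designer": ["Acme Studio", "Northwind Design"],
}


def sample_profile(rng):
    """Ancestral sampling: draw each parent node before its children."""
    occupation = rng.choice(OCCUPATIONS)            # root node
    workplace = rng.choice(WORKPLACES[occupation])  # conditioned on occupation
    return {"name": "Alice", "occupation": occupation, "workplace": workplace}


def make_message(profile):
    # A real pipeline might ask an LLM to paraphrase this sentence;
    # a fixed template keeps the sketch deterministic and runnable offline.
    return (
        f"My friend {profile['name']} works as a "
        f"{profile['occupation']} at {profile['workplace']}."
    )


def make_qa(profile):
    # The answer is copied from the sampled fact, never parsed out of the message.
    return {
        "question": f"Where does {profile['name']} work?",
        "answer": profile["workplace"],
    }


rng = random.Random(0)  # seeded so repeated runs give the same sample
profile = sample_profile(rng)
message = make_message(profile)
qa = make_qa(profile)
print(message)
print(qa)
```

Because the message and the QA are both derived from `profile`, the reference answer stays correct even if the message wording were later rewritten by a model.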
Taxonomy
Topics: AI in Service Interactions · Business Process Modeling and Analysis · Multi-Agent Systems and Negotiation
