ES-Mem: Event Segmentation-Based Memory for Long-Term Dialogue Agents

Huhai Zou; Tianhao Sun; Chuanjiang He; Yu Tian; Zhenyang Li; Li Jin; Nayu Liu; Jiang Zhong; Kaiwen Wei

arXiv:2601.07582 · cs.CL · January 14, 2026

Open Access

TL;DR

ES-Mem introduces a hierarchical memory system with event segmentation to improve long-term dialogue coherence and context retrieval, addressing limitations of existing flat memory models.

Contribution

The paper presents a novel event segmentation-based memory framework that enhances semantic coherence and structural context navigation in long-term dialogue agents.

Findings

1. Outperforms baseline methods on two memory benchmarks.

2. Robust event segmentation applicable to dialogue datasets.

3. Hierarchical memory improves context localization.

Abstract

Memory is critical for dialogue agents to maintain coherence and enable continuous adaptation in long-term interactions. While existing memory mechanisms offer basic storage and retrieval capabilities, they are hindered by two primary limitations: (1) rigid memory granularity often disrupts semantic integrity, resulting in fragmented and incoherent memory units; (2) prevalent flat retrieval paradigms rely solely on surface-level semantic similarity, neglecting the structural cues of discourse required to navigate and locate specific episodic contexts. To mitigate these limitations, drawing inspiration from Event Segmentation Theory, we propose ES-Mem, a framework incorporating two core components: (1) a dynamic event segmentation module that partitions long-term interactions into semantically coherent events with distinct boundaries; (2) a hierarchical memory architecture that…
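
As a rough illustration of the event segmentation idea described in the abstract, the sketch below opens a new event wherever the similarity between adjacent dialogue turns drops below a threshold. The bag-of-words cosine similarity, the threshold value, and the names `bow`, `cosine`, and `segment_events` are all assumptions made for illustration; the abstract does not specify how ES-Mem detects boundaries, and this is not the authors' method.

```python
import math
import re
from collections import Counter


def bow(text):
    """Lowercased bag-of-words counts (a hypothetical stand-in for a learned embedding)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def segment_events(turns, threshold=0.1):
    """Group consecutive turns into events.

    A boundary is placed between two adjacent turns when their similarity
    falls below `threshold` (an assumed, untuned value).
    """
    if not turns:
        return []
    events = [[turns[0]]]
    for prev, cur in zip(turns, turns[1:]):
        if cosine(bow(prev), bow(cur)) < threshold:
            events.append([cur])  # similarity dropped: start a new event
        else:
            events[-1].append(cur)  # same topic: extend the current event
    return events


turns = [
    "I adopted a puppy last week",
    "The puppy chewed my shoes last night",
    "Our flight to Tokyo leaves Monday",
    "The Tokyo flight is at noon",
]
events = segment_events(turns)
```

With these four turns, the puppy turns and the flight turns share no words across the middle boundary, so the sketch groups them into two separate events. A practical system would likely replace the word-overlap similarity with sentence embeddings or a model-based boundary detector.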

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Speech and dialogue systems