MemRouter: Memory-as-Embedding Routing for Long-Term Conversational Agents
Tianyu Hu, Weikai Lin, Weizhi Zhang, Jing Ma, Song Wang

TL;DR
MemRouter introduces an embedding-based routing policy for external memory management in long-term conversational agents, outperforming autoregressive LLM-based methods in accuracy and efficiency.
Contribution
A novel lightweight memory router that decouples memory admission from answer generation, improving performance and reducing latency in conversational QA.
Findings
MemRouter outperforms an LLM-based memory manager on LoCoMo in every question category (overall F1 52.0 vs 45.6).
Memory-management latency drops from 970 ms to 58 ms with MemRouter.
Learned admission improves F1 by 10.3 points over random storage.
Abstract
Long-term conversational agents must decide which turns to store in external memory, yet recent systems rely on autoregressive LLM generation at every turn to make that decision. We present MemRouter, a write-side memory router that decouples memory admission from the downstream answer backbone and replaces per-turn memory-management decoding with an embedding-based routing policy. MemRouter encodes each turn together with recent context, projects the resulting embeddings through a frozen LLM backbone, and predicts whether the turn should be stored using lightweight classification heads while training only 12M parameters. Under a controlled matched-harness comparison on LoCoMo, where the retrieval pipeline, answer prompts, and QA backbone (Qwen2.5-7B) are held identical, MemRouter outperforms an LLM-based memory manager on every question category (overall F1 52.0 vs 45.6,…
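The write-side admission step described in the abstract (encode the turn with recent context, pass it through a frozen backbone, score it with a small trainable head) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the hashed bag-of-words `embed` stands in for the real encoder, `FrozenBackbone` is a fixed random projection standing in for the frozen LLM, `AdmissionHead` is a single logistic unit, and the names `route`, `DIM`, and the 0.5 threshold are hypothetical.

```python
import hashlib
import math
import random

DIM = 16  # toy embedding width (assumption; not taken from the paper)


def embed(text, dim=DIM):
    """Stand-in encoder: hashed bag-of-words vector, L2-normalised."""
    vec = [0.0] * dim
    for tok in text.lower().split():
        idx = int(hashlib.md5(tok.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class FrozenBackbone:
    """Stand-in for the frozen LLM: fixed random projection followed by tanh."""

    def __init__(self, dim=DIM, seed=0):
        rng = random.Random(seed)
        scale = 1.0 / math.sqrt(dim)
        self.w = [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(dim)]

    def __call__(self, x):
        return [math.tanh(sum(wi * xi for wi, xi in zip(row, x))) for row in self.w]


class AdmissionHead:
    """Lightweight trainable head: the only parameters that get updated."""

    def __init__(self, dim=DIM):
        self.w = [0.0] * dim
        self.b = 0.0

    def prob(self, h):
        z = sum(wi * hi for wi, hi in zip(self.w, h)) + self.b
        return 1.0 / (1.0 + math.exp(-z))

    def sgd_step(self, h, label, lr=0.5):
        # Gradient of binary cross-entropy with respect to the logit is (p - y).
        err = self.prob(h) - label
        self.w = [wi - lr * err * hi for wi, hi in zip(self.w, h)]
        self.b -= lr * err


def route(turn, context, backbone, head, threshold=0.5):
    """Return (store?, probability) for one turn given recent context turns."""
    x = embed(" ".join(context + [turn]))
    p = head.prob(backbone(x))
    return p >= threshold, p


if __name__ == "__main__":
    backbone, head = FrozenBackbone(), AdmissionHead()
    context = ["Hi there"]
    turn = "My sister Ana moved to Lisbon"
    h = backbone(embed(" ".join(context + [turn])))
    for _ in range(20):
        head.sgd_step(h, 1)  # pretend this turn was labelled "worth storing"
    print(route(turn, context, backbone, head))
```

The point of the sketch is the division of labour: the routing decision is a single forward pass plus a threshold, with no token-by-token decoding, and only the head's weights change during training while the backbone stays fixed.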
