MemFlow: Intent-Driven Memory Orchestration for Small Language Model Agents
Jiayi Chen, Yingcong Li, Guiling Wang

TL;DR
MemFlow is a framework that externalizes memory management for small language models, improving long-horizon reasoning through intent-based routing and specialized retrieval strategies.
Contribution
MemFlow introduces a training-free, route-then-compile memory orchestration method that substantially improves accuracy on long-horizon tasks for small language models.
Findings
MemFlow nearly doubles accuracy over full-context baselines.
Structured intent routing improves evidence relevance and reasoning.
Dynamic tier-aware memory management reduces context overflow.
Abstract
Modern language agents must operate over long-horizon, multi-turn histories, yet deploying such agents with Small Language Models (SLMs) remains fundamentally difficult. Full-context prompting causes context overflow, flat retrieval exposes the model to noisy evidence, and open-ended agentic loops are unreliable under limited reasoning capacity. We argue that a substantial portion of SLM memory failure arises from mismatched memory operations: different query types demand categorically different retrieval strategies, evidence transformations, and context budgets that SLMs cannot reliably self-orchestrate through open-ended reasoning. We introduce MemFlow, a training-free memory orchestration framework that externalizes memory planning from the SLM. A Router Agent classifies each query by intent and dispatches it to the Memory Agent, which executes one of three specialized tiers (Profile…
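The route-then-compile idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the keyword-based `route` function is a toy stand-in for the Router Agent, the names `tier_2` and `tier_3` are placeholders because the abstract is truncated after the first tier (Profile), and `MemoryItem`, the `kind` tags, and the per-tier budgets are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MemoryItem:
    text: str
    kind: str  # hypothetical tag, e.g. "profile" or "event"


def route(query: str) -> str:
    """Toy intent classifier; keyword rules stand in for the Router Agent."""
    q = query.lower()
    if any(w in q for w in ("my ", "prefer", "favorite")):
        return "profile"
    if any(w in q for w in ("when", "before", "after")):
        return "tier_2"  # placeholder name, not from the paper
    return "tier_3"      # placeholder name, not from the paper


def profile_tier(query: str, store: List[MemoryItem], budget: int) -> List[str]:
    # Return stored profile facts, capped at the tier's budget.
    return [m.text for m in store if m.kind == "profile"][:budget]


def tier_2(query: str, store: List[MemoryItem], budget: int) -> List[str]:
    # Assumed behavior: return event-like memories, capped at the budget.
    return [m.text for m in store if m.kind == "event"][:budget]


def tier_3(query: str, store: List[MemoryItem], budget: int) -> List[str]:
    # Assumed fallback: rank all memories by word overlap with the query.
    words = set(query.lower().split())
    ranked = sorted(
        store, key=lambda m: -len(words & set(m.text.lower().split()))
    )
    return [m.text for m in ranked][:budget]


# Each intent maps to one handler and one context budget (number of items).
TIERS = {
    "profile": (profile_tier, 2),
    "tier_2": (tier_2, 3),
    "tier_3": (tier_3, 4),
}


def compile_context(query: str, store: List[MemoryItem]) -> str:
    """Route the query, run the matching tier, and build a compact prompt."""
    intent = route(query)
    handler, budget = TIERS[intent]
    evidence = handler(query, store, budget)
    lines = [f"[intent: {intent}]"]
    lines += [f"- {e}" for e in evidence]
    lines.append(f"Question: {query}")
    return "\n".join(lines)
```

The point of the sketch is the control flow: the small model never decides how to search its memory. An external router picks one fixed retrieval path with a bounded budget, and the model only sees the compiled result.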
