MemFly: On-the-Fly Memory Optimization via Information Bottleneck
Zhenyuan Zhang, Xianzhang Jia, Zhiqin Yang, Zhenbo Song, Wei Xue, Sirui Han, Yike Guo

TL;DR
MemFly is an information-bottleneck-based framework for dynamic memory management in large language models. It balances memory compression against retrieval accuracy and pairs the resulting memory with a hybrid, multi-path retrieval system, improving performance on complex tasks.
Contribution
We propose MemFly, a novel on-the-fly memory optimization framework for LLMs that uses information bottleneck principles and hybrid retrieval mechanisms to enhance memory efficiency and task accuracy.
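For readers unfamiliar with the information bottleneck, the classic objective can be written as below. This is the standard textbook form, not necessarily the exact objective MemFly optimizes. The mapping of symbols to memory is an assumption made here for illustration: X stands for the raw interaction history, Z for the compressed memory, and Y for the information needed by downstream tasks. The abstract speaks of "compression entropy" and "relevance entropy", so the paper's own formulation may differ from this mutual-information form.

```latex
% Classic information bottleneck Lagrangian (standard form; the symbol-to-memory mapping is an assumption):
%   X = raw interaction history, Z = compressed memory, Y = task-relevant information
\min_{p(z \mid x)} \; \mathcal{L}_{\mathrm{IB}} \;=\; I(X; Z) \;-\; \beta \, I(Z; Y), \qquad \beta > 0
```

The first term penalizes how much of the raw history the memory retains (compression), the second rewards how much task-relevant information it preserves (relevance), and the multiplier β sets the trade-off between the two.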
Findings
Outperforms state-of-the-art baselines in memory coherence.
Improves response fidelity and accuracy.
Efficiently handles complex multi-hop queries.
Abstract
Long-term memory enables large language model agents to tackle complex tasks through historical interactions. However, existing frameworks encounter a fundamental dilemma between compressing redundant information efficiently and maintaining precise retrieval for downstream tasks. To bridge this gap, we propose MemFly, a framework grounded in information bottleneck principles that facilitates on-the-fly memory evolution for LLMs. Our approach minimizes compression entropy while maximizing relevance entropy via a gradient-free optimizer, constructing a stratified memory structure for efficient storage. To fully leverage MemFly, we develop a hybrid retrieval mechanism that seamlessly integrates semantic, symbolic, and topological pathways, incorporating iterative refinement to handle complex multi-hop queries. Comprehensive experiments demonstrate that MemFly substantially outperforms…
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Advanced Graph Neural Networks
