TL;DR
StreamMeCo is a memory compression framework for streaming video understanding. It evicts redundant nodes from a vision agent's memory graph, which lowers storage and computation costs and speeds up retrieval while maintaining or improving accuracy.
Contribution
It introduces edge-free minmax sampling for isolated memory nodes, edge-aware weight pruning for connected nodes, and a time-decay retrieval mechanism that offsets the performance lost to compression of long-term agent memory.
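The summary does not spell out how the two eviction strategies work, so the following is only a rough sketch of the idea of treating nodes differently based on connectivity. It assumes that "minmax sampling" resembles greedy farthest-point selection over node embeddings, and that "weight pruning" keeps the connected nodes with the largest total incident edge weight. Both interpretations, along with every function name and the `keep_ratio` parameter, are hypothetical and not taken from the paper.

```python
import math


def split_by_connectivity(nodes, edges):
    """Split node ids into (isolated, connected); edges maps (u, v) -> weight."""
    touched = {n for pair in edges for n in pair}
    isolated = [n for n in nodes if n not in touched]
    connected = [n for n in nodes if n in touched]
    return isolated, connected


def minmax_sample(embeddings, ids, keep):
    """ASSUMPTION: greedy max-min (farthest-point) selection over embeddings."""
    if keep <= 0 or not ids:
        return []
    chosen = [ids[0]]
    # Distance from every candidate to its nearest already-chosen node.
    nearest = {i: math.dist(embeddings[i], embeddings[ids[0]]) for i in ids}
    while len(chosen) < min(keep, len(ids)):
        # Pick the candidate that is farthest from everything chosen so far.
        nxt = max((i for i in ids if i not in chosen), key=lambda i: nearest[i])
        chosen.append(nxt)
        for i in ids:
            nearest[i] = min(nearest[i], math.dist(embeddings[i], embeddings[nxt]))
    return chosen


def prune_by_edge_weight(ids, edges, keep):
    """ASSUMPTION: keep the nodes with the largest summed incident edge weight."""
    strength = {i: 0.0 for i in ids}
    for (u, v), w in edges.items():
        if u in strength:
            strength[u] += w
        if v in strength:
            strength[v] += w
    return sorted(ids, key=lambda i: strength[i], reverse=True)[:keep]


def compress(nodes, embeddings, edges, keep_ratio=0.3):
    """Keep roughly keep_ratio of each group; 0.3 mirrors 70% compression."""
    isolated, connected = split_by_connectivity(nodes, edges)
    keep_iso = math.ceil(keep_ratio * len(isolated))
    keep_con = math.ceil(keep_ratio * len(connected))
    return (minmax_sample(embeddings, isolated, keep_iso)
            + prune_by_edge_weight(connected, edges, keep_con))
```

In this sketch, isolated nodes are thinned for diversity because they have no edges to rank them by, while connected nodes are ranked by how strongly they are tied to the rest of the graph. The actual criteria in StreamMeCo may differ.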
Findings
Achieves 70% memory graph compression with 1.87x speedup in retrieval.
Delivers an average accuracy improvement of 1.0% under compression.
Evaluated on three benchmark datasets: M3-Bench-robot, M3-Bench-web, and Video-MME-Long.
Abstract
Vision agent memory has shown remarkable effectiveness in streaming video understanding. However, storing such memory for videos incurs substantial overhead, leading to high costs in both storage and computation. To address this issue, we propose StreamMeCo, an efficient Stream Agent Memory Compression framework. Specifically, based on the connectivity of the memory graph, StreamMeCo introduces edge-free minmax sampling for isolated nodes and edge-aware weight pruning for connected nodes, evicting redundant memory nodes while maintaining accuracy. In addition, we introduce a time-decay memory retrieval mechanism to further eliminate the performance degradation caused by memory compression. Extensive experiments on three challenging benchmark datasets (M3-Bench-robot, M3-Bench-web, and Video-MME-Long) demonstrate that under 70% memory graph compression, StreamMeCo…
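The abstract names a time-decay memory retrieval mechanism but does not give its form. As a minimal illustration of what time-weighted retrieval can look like, the sketch below multiplies cosine similarity by an exponential decay in the node's age. The exponential form, the direction of the decay (older nodes discounted), the `decay_rate` value, and all function names are assumptions made for illustration, not the paper's definition.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def time_decay_score(similarity, node_time, query_time, decay_rate=0.01):
    """ASSUMPTION: similarity discounted exponentially by node age (seconds)."""
    age = max(0.0, query_time - node_time)
    return similarity * math.exp(-decay_rate * age)


def retrieve(memory, query_vec, query_time, top_k=2, decay_rate=0.01):
    """Rank memory nodes, given as (node_id, embedding, timestamp) tuples."""
    scored = [
        (time_decay_score(cosine(emb, query_vec), ts, query_time, decay_rate), nid)
        for nid, emb, ts in memory
    ]
    scored.sort(reverse=True)  # highest decayed score first
    return [nid for _, nid in scored[:top_k]]


memory = [
    ("old", (1.0, 0.0), 0.0),     # relevant but 100 s old
    ("new", (1.0, 0.0), 90.0),    # equally relevant, 10 s old
    ("off", (0.0, 1.0), 100.0),   # fresh but orthogonal to the query
]
top = retrieve(memory, (1.0, 0.0), query_time=100.0)
```

With these toy values, the two nodes that match the query outrank the fresh but irrelevant one, and the more recent match comes first.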
