MemFlow: Flowing Adaptive Memory for Consistent and Efficient Long Video Narratives

Sihui Ji; Xi Chen; Shuai Yang; Xin Tao; Pengfei Wan; Hengshuang Zhao

arXiv:2512.14699·cs.CV·December 17, 2025

TL;DR

MemFlow introduces a dynamic memory system for long video generation that retrieves relevant historical frames based on text prompts, ensuring narrative coherence and efficiency with minimal computational overhead.

Contribution

The paper presents MemFlow, a novel memory mechanism that dynamically updates and selectively activates relevant frames for consistent long video narration.
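The memory update can be illustrated with a minimal sketch. It assumes that the chunk's text prompt and each historical frame are already embedded in a shared vector space and that relevance is scored by cosine similarity. The source does not specify either detail, so the function names (`cosine`, `update_memory`), the similarity measure, and the toy embeddings are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def update_memory(prompt_emb, history, capacity):
    """Pick the `capacity` historical frames most similar to the prompt.

    Assumption: relevance is cosine similarity in a shared embedding space.
    Returns the selected frame indices in chronological order.
    """
    ranked = sorted(
        range(len(history)),
        key=lambda i: cosine(prompt_emb, history[i]),
        reverse=True,
    )
    return sorted(ranked[:capacity])


# Toy history: frames 0 and 2 show one scene, frames 1 and 3 another.
history = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]

# The memory bank is rebuilt before each chunk, so it follows the prompt.
memory_for_scene_a = update_memory([1.0, 0.0], history, capacity=2)
memory_for_scene_b = update_memory([0.0, 1.0], history, capacity=2)
```

Because selection is repeated for every chunk, a prompt that returns to an earlier scene pulls that scene's frames back into memory, which a fixed compression schedule cannot do.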

Findings

01 Achieves long-term narrative coherence in video generation.

02 Maintains high efficiency, with only a 7.9% speed reduction.

03 Compatible with existing streaming video models.

Abstract

The core challenge in streaming video generation is maintaining content consistency over long contexts, which places high demands on memory design. Most existing solutions maintain the memory by compressing historical frames with predefined strategies. However, different video chunks to be generated should refer to different historical cues, which is hard to satisfy with fixed strategies. In this work, we propose MemFlow to address this problem. Specifically, before generating the upcoming chunk, we dynamically update the memory bank by retrieving the historical frames most relevant to the text prompt of this chunk. This design enables narrative coherence even if a new event happens or the scenario switches in future frames. In addition, during generation, we activate only the most relevant tokens in the memory bank for each query in the attention layers, which effectively guarantees…
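The selective activation described in the abstract resembles top-k sparse attention. The sketch below shows that general technique for a single query, not the paper's implementation. The function name `topk_attention`, the scaled dot-product scoring, and the value of `k` are assumptions made for illustration.

```python
import math


def topk_attention(query, keys, values, k):
    """Attend from one query to only its k highest-scoring memory tokens.

    Assumption: scores are scaled dot products, and tokens outside the
    top k are dropped before the softmax. Returns (output, active_indices).
    """
    k = max(1, min(k, len(keys)))
    scale = math.sqrt(len(query))
    scores = [sum(q * x for q, x in zip(query, key)) / scale for key in keys]

    # Keep the k best-scoring tokens; ties keep the earlier token.
    keep = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]

    # Numerically stable softmax over the kept tokens only.
    peak = max(scores[i] for i in keep)
    weights = {i: math.exp(scores[i] - peak) for i in keep}
    total = sum(weights.values())

    output = [0.0] * len(values[0])
    for i, weight in weights.items():
        for j, v in enumerate(values[i]):
            output[j] += (weight / total) * v
    return output, sorted(keep)


# Four memory tokens; only tokens 0 and 2 align with the query.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [-1.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [3.0, 0.0], [0.0, 5.0]]
output, active = topk_attention([1.0, 0.0], keys, values, k=2)
```

With `k` equal to the number of memory tokens the function reduces to ordinary softmax attention, so `k` trades attention cost against how much of the memory each query can see.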

Code & Models

Models

🤗 KlingTeam/MemFlow (model)

Taxonomy

Topics: Video Analysis and Summarization · Visual Attention and Saliency Detection · Generative Adversarial Networks and Image Synthesis