Advancing Narrative Long Video Generation via Training-Free Identity-Aware Memory

Jinzhuo Liu; Jiangning Zhang; Wencan Jiang; Yabiao Wang; Dingkang Liang; Zhucun Xue; Ran Yi; Yong Liu

arXiv:2605.18733 · cs.CV · May 19, 2026

TL;DR

This paper introduces IAMFlow, a training-free, identity-aware memory framework for long video generation. It keeps entity identities consistent across prompt transitions and improves on baselines in both benchmark score and generation speed.

Contribution

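It proposes explicit entity tracking: an LLM extracts entities and their visual attributes from each prompt and assigns them unique global IDs, while a VLM verifies and refines those attributes against rendered frames. It also introduces a new benchmark for narrative streaming video generation.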

Findings

01. IAMFlow outperforms baselines by 2.56 points on NarraStream-Bench.

02. It achieves a 1.39× speedup over the most efficient baseline.

03. The framework effectively maintains entity identities across prompts.

Abstract

Autoregressive video generation has improved rapidly in visual fidelity and interactivity, but it still suffers from long-term inconsistency and memory degradation. Most existing solutions either compress historical frames using predefined strategies or retrieve keyframes based on coarse implicit attention signals, both of which fail to handle evolving prompts with shifting entity references, leading to identity drift, character duplication, and attribute loss. To address this, we propose IAMFlow, a training-free identity-aware memory framework that explicitly models and tracks persistent entity identities, enabling consistent generation across prompt transitions. Specifically, an LLM extracts entities with visual attributes from each prompt and assigns unique global IDs for identity-aware memory, while a VLM asynchronously verifies and refines attributes from rendered frames, enabling…
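The identity-tracking idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `Entity` and `IdentityMemory` names, the name-based matching rule, and the `refine` method are all assumptions made for illustration, and the LLM extraction and VLM verification steps are replaced by hand-written inputs. It only shows how a repeated entity reference could resolve to the same global ID while its attributes are accumulated and later corrected.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from itertools import count


@dataclass
class Entity:
    """One tracked entity (hypothetical structure, not from the paper)."""
    global_id: int
    name: str
    attributes: dict[str, str] = field(default_factory=dict)


class IdentityMemory:
    """Hypothetical registry that maps entity names to persistent global IDs."""

    def __init__(self) -> None:
        self._next_id = count(1)
        self._by_name: dict[str, Entity] = {}

    def register(self, name: str, attributes: dict[str, str]) -> Entity:
        # Assumption: entities are matched by normalized name. A real system
        # would need a more robust way to resolve shifting references.
        key = name.strip().lower()
        entity = self._by_name.get(key)
        if entity is None:
            entity = Entity(next(self._next_id), key, dict(attributes))
            self._by_name[key] = entity
        else:
            # Keep attributes already stored; only add ones not seen before.
            for attr, value in attributes.items():
                entity.attributes.setdefault(attr, value)
        return entity

    def refine(self, global_id: int, observed: dict[str, str]) -> None:
        # Stand-in for the VLM step: overwrite stored attributes with values
        # observed in rendered frames.
        for entity in self._by_name.values():
            if entity.global_id == global_id:
                entity.attributes.update(observed)
                return
        raise KeyError(global_id)


# Entities as they might be extracted from two consecutive prompts.
memory = IdentityMemory()
first = memory.register("Knight", {"armor": "silver"})
second = memory.register("knight", {"cape": "red"})
dragon = memory.register("dragon", {"color": "green"})

# A later observation from rendered frames corrects one attribute.
memory.refine(first.global_id, {"armor": "dark silver"})
```

In this sketch both mentions of the knight share one global ID, so attributes from the second prompt extend the existing record instead of creating a duplicate character, which is the failure mode the abstract calls character duplication.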
