FilmWeaver: Weaving Consistent Multi-Shot Videos with Cache-Guided Autoregressive Diffusion

Xiangyang Luo; Qingyu Li; Xiaokun Liu; Wenyu Qin; Miao Yang; Meng Wang; Pengfei Wan; Di Zhang; Kun Gai; Shao-Lun Huang

arXiv:2512.11274·cs.CV·December 15, 2025

TL;DR

FilmWeaver introduces a novel autoregressive diffusion framework with a dual-level cache mechanism to generate multi-shot videos that maintain character and background consistency across shots and support arbitrary length and user interaction.

Contribution

The paper presents a framework that decouples inter-shot consistency from intra-shot coherence, enabling flexible multi-shot video generation and supporting downstream tasks such as multi-concept injection and video extension.

Findings

01 Outperforms existing methods on consistency metrics

02 Supports multi-concept injection and video extension

03 Achieves high aesthetic quality in generated videos

Abstract

Current video generation models perform well at single-shot synthesis but struggle with multi-shot videos, facing critical challenges in maintaining character and background consistency across shots and flexibly generating videos of arbitrary length and shot count. To address these limitations, we introduce FilmWeaver, a novel framework designed to generate consistent, multi-shot videos of arbitrary length. First, it employs an autoregressive diffusion paradigm to achieve arbitrary-length video generation. To address the challenge of consistency, our key insight is to decouple the problem into inter-shot consistency and intra-shot coherence. We achieve this through a dual-level cache mechanism: a shot memory caches keyframes from preceding shots to maintain character and scene identity, while a temporal memory retains a history of frames from the current shot to ensure smooth,…
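The dual-level cache described in the abstract can be pictured as two buffers with different lifetimes. What follows is a minimal Python sketch under stated assumptions, not the authors' implementation: the class `DualLevelCache`, the functions `generate_video` and `stub_generate_chunk`, the window size, and the evenly spaced keyframe rule are all hypothetical, and the stub stands in for the diffusion model, whose actual conditioning is not specified in this excerpt.

```python
from collections import deque

class DualLevelCache:
    """Hypothetical two-level cache; names and sizes are illustrative assumptions."""

    def __init__(self, temporal_window=4, keyframes_per_shot=1):
        self.keyframes_per_shot = keyframes_per_shot
        # Shot memory: keyframes from preceding shots (kept across shots).
        self.shot_memory = []
        # Temporal memory: sliding window of recent frames in the current shot.
        self.temporal_memory = deque(maxlen=temporal_window)

    def add_chunk(self, frames):
        # Old frames fall out automatically once the window is full.
        self.temporal_memory.extend(frames)

    def end_shot(self, shot_frames):
        # Assumption: pick evenly spaced frames as keyframes for the finished shot.
        step = max(1, len(shot_frames) // self.keyframes_per_shot)
        self.shot_memory.extend(shot_frames[::step][: self.keyframes_per_shot])
        # A new shot starts with an empty temporal history.
        self.temporal_memory.clear()


def generate_video(shots, generate_chunk, cache):
    """Autoregressive loop: each chunk is conditioned on both cache levels."""
    video = []
    for prompt, num_chunks in shots:
        shot_frames = []
        for _ in range(num_chunks):
            chunk = generate_chunk(
                prompt, list(cache.shot_memory), list(cache.temporal_memory)
            )
            cache.add_chunk(chunk)
            shot_frames.extend(chunk)
        cache.end_shot(shot_frames)
        video.append(shot_frames)
    return video


def stub_generate_chunk(prompt, shot_ctx, temporal_ctx, chunk_len=2):
    # Placeholder for a diffusion model: each "frame" is a string that records
    # how much context of each kind was visible when it was generated.
    return [
        f"{prompt}|s{len(shot_ctx)}|t{len(temporal_ctx)}|{i}"
        for i in range(chunk_len)
    ]


if __name__ == "__main__":
    cache = DualLevelCache(temporal_window=4, keyframes_per_shot=1)
    video = generate_video([("A", 3), ("B", 2)], stub_generate_chunk, cache)
    for shot in video:
        print(shot)
```

In this sketch the number of shots and chunks is unbounded because each step only reads the two fixed-size buffers, which is one way to read the abstract's claim of arbitrary-length generation. Extending an existing video would amount to calling `generate_video` again with the same cache.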


Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Video Analysis and Summarization · Human Pose and Action Recognition