Rolling Sink: Bridging Limited-Horizon Training and Open-Ended Testing in Autoregressive Video Diffusion

Haodong Li; Shaoteng Liu; Zhe Lin; Manmohan Chandraker

arXiv:2602.07775 · cs.CV · May 5, 2026

TL;DR

Rolling Sink is a training-free method that extends autoregressive video diffusion models to ultra-long durations, maintaining visual quality and temporal coherence beyond limited training horizons.

Contribution

The paper introduces Rolling Sink, a training-free approach that bridges the train-test gap in long-video synthesis. The method is derived from a systematic analysis of AR cache maintenance.

Findings

01. Enables 5-30 minute video synthesis at 16 FPS with stable quality.

02. Achieves superior long-horizon visual fidelity and temporal consistency.

03. Builds on Self Forcing and remains effective although the base model was trained on only 5s clips.
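To give a sense of the scale of the train-test gap described in these findings, the short sketch below converts the stated durations into frame counts. It assumes the 5s training clips run at the same 16 FPS as the generated videos, which the summary does not state explicitly; the helper name `frames` is purely illustrative and counts output frames, not latent frames.

```python
# Frame-count arithmetic for the horizons quoted in the findings.
# Assumption: training clips use the same 16 FPS as the generated video.
FPS = 16
TRAIN_CLIP_SECONDS = 5


def frames(seconds: int, fps: int = FPS) -> int:
    """Number of output frames in a clip of the given length."""
    return seconds * fps


train_frames = frames(TRAIN_CLIP_SECONDS)

# Test-time horizons from the findings: 5 and 30 minutes.
horizons = {minutes: frames(minutes * 60) for minutes in (5, 30)}

for minutes, n in horizons.items():
    ratio = n // train_frames
    print(f"{minutes} min -> {n} frames ({ratio}x the training clip)")
```

Under this assumption, a 5s training clip is 80 frames, while the 5 and 30 minute test horizons are 4,800 and 28,800 frames, i.e. 60x and 360x the training horizon.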

Abstract

Recently, autoregressive (AR) video diffusion models have achieved remarkable performance. However, due to their limited training durations, a train-test gap emerges when testing at longer horizons, leading to rapid visual degradation. Following Self Forcing, which studies the train-test gap within the training duration, this work studies the train-test gap beyond the training duration, i.e., the gap between the limited horizons during training and open-ended horizons during testing. Since open-ended testing can extend beyond any finite training window, and long-video training is computationally expensive, we pursue a training-free solution to bridge this gap. To this end, we conduct a systematic analysis of AR cache maintenance. These insights lead to Rolling Sink. Built on Self Forcing (trained on only 5s clips), Rolling Sink effectively scales the AR video…
