STARFlow-V: End-to-End Video Generative Modeling with Normalizing Flows

Jiatao Gu; Ying Shen; Tianrong Chen; Laurent Dinh; Yuyang Wang; Miguel Angel Bautista; David Berthelot; Josh Susskind; Shuangfei Zhai

arXiv:2511.20462 · cs.CV · November 27, 2025

TL;DR

STARFlow-V introduces a normalizing flow-based approach for end-to-end video generation, demonstrating high-quality, temporally consistent videos with efficient sampling, challenging the dominance of diffusion models in this domain.

Contribution

The paper presents STARFlow-V, a novel normalizing flow-based video generator with a global-local architecture, flow-score matching, and parallelizable sampling, enabling high-quality autoregressive video synthesis.

Findings

1. Achieves strong visual fidelity and temporal consistency.

2. Supports multiple generation tasks including text-to-video.

3. Provides practical sampling throughput comparable to diffusion models.

Abstract

Normalizing flows (NFs) are end-to-end likelihood-based generative models for continuous data, and have recently regained attention with encouraging progress on image generation. Yet in the video generation domain, where spatiotemporal complexity and computational cost are substantially higher, state-of-the-art systems almost exclusively rely on diffusion-based models. In this work, we revisit this design space by presenting STARFlow-V, a normalizing flow-based video generator with substantial benefits such as end-to-end learning, robust causal prediction, and native likelihood estimation. Building upon the recently proposed STARFlow, STARFlow-V operates in the spatiotemporal latent space with a global-local architecture which restricts causal dependencies to a global latent space while preserving rich local within-frame interactions. This eases error accumulation over time, a common…
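
To make the phrases "likelihood-based" and "causal prediction" concrete, the following is a minimal sketch of a causal affine normalizing flow on a toy scalar sequence. It is not the STARFlow-V architecture: the conditioner `context_params`, its weights `w_mu` and `w_s`, and the 1-D data are all hypothetical choices made for illustration. The sketch only shows the general mechanism, in which an invertible map with a triangular Jacobian gives an exact log-likelihood through the change-of-variables formula.

```python
import numpy as np

def context_params(prefix, w_mu=0.5, w_s=0.1):
    """Hypothetical causal conditioner: shift and log-scale computed
    from earlier steps only (never from the current or future steps)."""
    if prefix.size == 0:
        return 0.0, 0.0
    m = prefix.mean()
    return w_mu * m, w_s * np.tanh(m)

def forward(x):
    """Map data x (shape [T]) to noise z; also return log|det dz/dx|."""
    x = np.asarray(x, dtype=float)
    z = np.empty_like(x)
    log_det = 0.0
    for t in range(len(x)):
        mu, s = context_params(x[:t])
        z[t] = (x[t] - mu) * np.exp(-s)
        log_det -= s  # Jacobian is lower triangular with diagonal exp(-s)
    return z, log_det

def inverse(z):
    """Sample path: invert the flow one step at a time (autoregressive)."""
    z = np.asarray(z, dtype=float)
    x = np.empty_like(z)
    for t in range(len(z)):
        mu, s = context_params(x[:t])  # uses only already-generated steps
        x[t] = z[t] * np.exp(s) + mu
    return x

def log_likelihood(x):
    """Exact log p(x) = log N(z; 0, I) + log|det dz/dx|."""
    z, log_det = forward(x)
    log_prior = -0.5 * np.sum(z ** 2) - 0.5 * len(z) * np.log(2 * np.pi)
    return log_prior + log_det

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(6)
    sample = inverse(noise)             # generate a sequence from noise
    recovered, _ = forward(sample)      # map it back to noise
    print("round-trip ok:", np.allclose(noise, recovered))
    print("log-likelihood:", log_likelihood(sample))
```

In this toy setting the forward (density) pass could be computed for all steps at once because every step's context is already known, while the inverse (sampling) pass is inherently sequential.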

Code & Models

Models

Hugging Face model: apple/starflow

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Model Reduction and Neural Networks