LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation

Yukang Chen; Luozhou Wang; Wei Huang; Shuai Yang; Bohan Zhang; Yicheng Xiao; Ruihang Chu; Weian Mao; Qixin Hu; Shaoteng Liu; Yuyang Zhao; Huizi Mao; Ying-Cong Chen; Enze Xie; Xiaojuan Qi; Song Han

arXiv:2605.18739 · cs.CV · May 20, 2026

1 Repo · 4 Models · 1 Dataset

TL;DR

LongLive-2.0 introduces a parallel NVFP4 infrastructure for long video generation. It speeds up training (up to 2.15x) and inference (up to 1.84x) and reduces GPU memory cost, using sequence-parallel autoregressive training and NVFP4 quantization.

Contribution

LongLive-2.0 is presented as the first NVFP4-based system for long video generation, combining sequence-parallel autoregressive training with optimized inference methods.
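
For readers unfamiliar with NVFP4: it is NVIDIA's 4-bit floating-point format, in which each small block of values shares a scale factor. The sketch below simulates that block-wise rounding in plain Python. It is an illustration only, not code from the nvlabs/LongLive repository. The function names are hypothetical, the block scale is kept as an ordinary Python float rather than stored in FP8, and the per-tensor scale is omitted.

```python
# Magnitudes representable by a 4-bit E2M1 float (sign handled separately).
E2M1_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
BLOCK = 16  # number of values that share one scale factor


def fake_quant_block(block):
    """Round one block to the E2M1 grid and return (dequantized values, scale).

    Simplification (assumption): the scale stays a Python float; a real
    NVFP4 kernel would store it in FP8 alongside a per-tensor scale.
    """
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return [0.0] * len(block), 1.0
    # Choose the scale so the largest magnitude maps to the top grid value.
    scale = amax / E2M1_GRID[-1]
    out = []
    for v in block:
        # Nearest representable magnitude for the scaled value.
        mag = min(E2M1_GRID, key=lambda g: abs(abs(v) / scale - g))
        out.append(mag * scale if v >= 0 else -mag * scale)
    return out, scale


def fake_quant(values):
    """Quantize and dequantize a flat list of floats, block by block."""
    out = []
    for start in range(0, len(values), BLOCK):
        q, _ = fake_quant_block(values[start:start + BLOCK])
        out.extend(q)
    return out
```

For example, `fake_quant([float(i) for i in range(16)])` maps the largest input, 15.0, back to exactly 15.0, because the block scale is chosen so that the largest magnitude lands on the top grid value. Every other entry is rounded to one of at most eight magnitudes.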

Findings

01 Up to 2.15x faster training
02 Up to 1.84x faster inference
03 Achieves 45.7 FPS on benchmarks

Abstract

We present LongLive-2.0, an NVFP4-based parallel infrastructure throughout the full training and inference workflow of long video generation, addressing speed and memory bottlenecks. For training, we introduce sequence-parallel autoregressive (AR) training, instantiated as Balanced SP, which co-designs the efficient teacher-forcing layout with SP execution by pairing clean-history and noisy-target temporal chunks on each rank, enabling a natural teacher-forcing mask with SP-aware chunked VAE encoding. Combined with NVFP4 precision, it reduces GPU memory cost and accelerates GEMM computation during training, the proportion of which increases as video length grows. Moreover, we show that a high-quality infrastructure and dataset enable a remarkably clean training pipeline. Unlike existing Self-Forcing series methods that rely on ODE initialization and subsequent distribution matching…
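
The teacher-forcing layout described in the abstract can be illustrated with a small chunk-level attention mask. The sketch below is one plausible reading, not the authors' implementation. It assumes the sequence is laid out as K clean chunks followed by K noisy chunks, that a noisy chunk attends to strictly earlier clean chunks and to itself, and that each sequence-parallel rank holds the clean and noisy chunks for the same time indices. The function names `teacher_forcing_mask` and `paired_layout` are hypothetical.

```python
def teacher_forcing_mask(num_chunks):
    """Chunk-level mask; mask[q][k] is True if query chunk q may attend to key chunk k.

    Assumed index layout: 0..K-1 are clean chunks, K..2K-1 are noisy chunks.
    """
    k = num_chunks
    mask = [[False] * (2 * k) for _ in range(2 * k)]
    for i in range(k):
        for j in range(k):
            mask[i][j] = j <= i      # clean chunk i sees clean chunks up to and including i
            mask[k + i][j] = j < i   # noisy chunk i sees only earlier clean chunks
        mask[k + i][k + i] = True    # noisy chunk i sees itself
    return mask


def paired_layout(num_chunks, world_size):
    """Give each rank the clean and noisy chunks for the same time indices (assumption)."""
    assert num_chunks % world_size == 0, "chunks must divide evenly across ranks"
    per_rank = num_chunks // world_size
    layout = []
    for rank in range(world_size):
        idx = list(range(rank * per_rank, (rank + 1) * per_rank))
        layout.append({"clean": idx, "noisy": list(idx)})
    return layout
```

With `num_chunks=3`, the first noisy chunk (row 3) attends only to itself, while the second noisy chunk (row 4) attends to clean chunk 0 and to itself. Under the pairing assumption, every rank carries the same number of clean and noisy chunks, whereas splitting the concatenated sequence contiguously would leave some ranks with only clean chunks and others with only noisy ones.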

Code & Models

Repositories

nvlabs/LongLive (GitHub)

Models

Datasets

Efficient-Large-Model/LongLive2.0-Toy-Dataset
dataset · 403 downloads
