LongCat-Video Technical Report

Meituan LongCat Team: Xunliang Cai; Qilong Huang; Zhuoliang Kang; Hongyu Li; Shijun Liang; Liya Ma; Siyu Ren; Xiaoming Wei; Rixu Xie; Tong Zhang

arXiv:2510.22200 · cs.CV · October 29, 2025

TL;DR

LongCat-Video is a 13.6B-parameter unified diffusion transformer that generates long, high-quality videos efficiently across multiple tasks, a first step toward world models.

Contribution

The paper introduces a versatile, large-scale video generation model that supports multiple tasks with efficient inference and strong performance, and that is trained with multi-reward RLHF.

Findings

01. Supports text-to-video, image-to-video, and video continuation tasks

02. Generates 720p, 30fps videos within minutes

03. Achieves performance comparable to leading models

Abstract

Video generation is a critical pathway toward world models, with efficient long video inference as a key capability. Toward this end, we introduce LongCat-Video, a foundational video generation model with 13.6B parameters, delivering strong performance across multiple video generation tasks. It particularly excels in efficient and high-quality long video generation, representing our first step toward world models. Key features include: Unified architecture for multiple tasks: Built on the Diffusion Transformer (DiT) framework, LongCat-Video supports Text-to-Video, Image-to-Video, and Video-Continuation tasks with a single model; Long video generation: Pretraining on Video-Continuation tasks enables LongCat-Video to maintain high quality and temporal coherence in the generation of minutes-long videos; Efficient inference: LongCat-Video generates 720p, 30fps videos within minutes by…
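
The abstract states that a single model covers Text-to-Video, Image-to-Video, and Video-Continuation. As a rough illustration only, the sketch below assumes the three tasks can be told apart by how many conditioning frames are supplied (none, one, or several). That assumption is not stated in the text above, and the names `Task` and `infer_task` are hypothetical, not part of any LongCat-Video API.

```python
from enum import Enum


class Task(Enum):
    """The three tasks named in the abstract."""
    TEXT_TO_VIDEO = "text-to-video"
    IMAGE_TO_VIDEO = "image-to-video"
    VIDEO_CONTINUATION = "video-continuation"


def infer_task(num_condition_frames: int) -> Task:
    """Map a conditioning-frame count to a task.

    ASSUMPTION (illustrative only): zero frames means text-only
    conditioning, one frame means a single reference image, and
    more than one frame means a video prefix to be continued.
    """
    if num_condition_frames < 0:
        raise ValueError("num_condition_frames must be non-negative")
    if num_condition_frames == 0:
        return Task.TEXT_TO_VIDEO
    if num_condition_frames == 1:
        return Task.IMAGE_TO_VIDEO
    return Task.VIDEO_CONTINUATION


if __name__ == "__main__":
    # Show the mapping for a few hypothetical frame counts.
    for n in (0, 1, 8):
        print(n, infer_task(n).value)
```

Under this framing the model itself would not need a task flag: the same network sees a different amount of clean context and denoises the remaining frames.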
