HunyuanVideo: A Systematic Framework For Large Video Generative Models

Weijie Kong; Qi Tian; Zijian Zhang; Rox Min; Zuozhuo Dai; Jin Zhou; Jiangfeng Xiong; Xin Li; Bo Wu; Jianwei Zhang; Kathrina Wu; Qin Lin; Junkun Yuan; Yanxin Long; Aladdin Wang; Andong Wang; Changlin Li; Duojun Huang; Fang Yang; Hao Tan; Hongmei Wang; Jacob Song; Jiawang Bai; Jianbing Wu; Jinbao Xue; Joey Wang; Kai Wang; Mengyang Liu; Pengyu Li; Shuai Li; Weiyan Wang; Wenqing Yu; Xinchi Deng; Yang Li; Yi Chen; Yutao Cui; Yuanbo Peng; Zhentao Yu; Zhiyu He; Zhiyong Xu; Zixiang Zhou; Zunnan Xu; Yangyu Tao; Qinglin Lu; Songtao Liu; Dax Zhou; Hongfa Wang; Yong Yang; Di Wang; Yuhong Liu; Jie Jiang; Caesar Zhong (refer to the report for detailed contributions)

arXiv:2412.03603 · cs.CV · March 12, 2025 · 6 cites

Open Access · 2 Repos · 10 Models

TL;DR

HunyuanVideo is an open-source large-scale video generative model that achieves state-of-the-art performance, surpassing many existing models, and aims to democratize access to advanced video generation technology.

Contribution

The paper introduces HunyuanVideo, the largest open-source video foundation model with over 13 billion parameters, and presents a comprehensive framework for high-quality, large-scale video generation.

Findings

01 HunyuanVideo outperforms previous state-of-the-art models.

02 The model demonstrates high visual quality and accurate motion dynamics.

03 Open-source release fosters community experimentation and innovation.

Abstract

Recent advancements in video generation have significantly impacted daily life for both individuals and industries. However, the leading video generation models remain closed-source, resulting in a notable performance gap between industry capabilities and those available to the public. In this report, we introduce HunyuanVideo, an innovative open-source video foundation model that demonstrates performance in video generation comparable to, or even surpassing, that of leading closed-source models. HunyuanVideo encompasses a comprehensive framework that integrates several key elements, including data curation, advanced architectural design, progressive model scaling and training, and an efficient infrastructure tailored for large-scale model training and inference. As a result, we successfully trained a video generative model with over 13 billion parameters, making it the largest among…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Video Analysis and Summarization · Human Pose and Action Recognition