MJ-VIDEO: Fine-Grained Benchmarking and Rewarding Video Preferences in Video Generation
Haibo Tong, Zhaoyang Wang, Zhaorun Chen, Haonian Ji, Shi Qiu, Siwei Han, Kexin Geng, Zhongkai Xue, Yiyang Zhou, Peng Xia, Mingyu Ding, Rafael Rafailov, Chelsea Finn, and Huaxiu Yao

TL;DR
This paper introduces MJ-BENCH-VIDEO, a comprehensive benchmark for evaluating video generation quality across multiple aspects, and proposes MJ-VIDEO, a new reward model that improves preference judgment accuracy and enhances video alignment.
Contribution
The paper presents a large-scale, fine-grained video preference benchmark and a novel MoE-based reward model that outperforms existing methods in preference assessment and tuning.
Findings
MJ-VIDEO achieves a 17.58% improvement in overall preference judgment.
The benchmark covers five critical aspects with 28 criteria.
MJ-VIDEO enhances video alignment in generation tasks.
Abstract
Recent advancements in video generation have significantly improved the ability to synthesize videos from text instructions. However, existing models still struggle with key challenges such as instruction misalignment, content hallucination, safety concerns, and bias. Addressing these limitations, we introduce MJ-BENCH-VIDEO, a large-scale video preference benchmark designed to evaluate video generation across five critical aspects: Alignment, Safety, Fineness, Coherence & Consistency, and Bias & Fairness. This benchmark incorporates 28 fine-grained criteria to provide a comprehensive evaluation of video preference. Building upon this dataset, we propose MJ-VIDEO, a Mixture-of-Experts (MoE)-based video reward model designed to deliver fine-grained reward. MJ-VIDEO can dynamically select relevant experts to accurately judge the preference based on the input text-video pair. This…
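The abstract says only that MJ-VIDEO dynamically selects relevant experts for a given text-video pair. As a rough illustration of that general idea, and not the authors' architecture, the sketch below shows one common way a Mixture-of-Experts reward could be combined: a router scores each aspect, the top-k aspects are kept, and their expert scores are averaged with softmax weights. The function names, the top-k softmax gating rule, and the numeric inputs are all assumptions made for illustration; only the five aspect names come from the abstract.

```python
import math

# The five aspects named in the abstract; one hypothetical expert per aspect.
ASPECTS = [
    "Alignment",
    "Safety",
    "Fineness",
    "Coherence & Consistency",
    "Bias & Fairness",
]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def moe_reward(gate_logits, expert_scores, top_k=2):
    """Combine per-aspect expert scores with top-k gating (assumed scheme).

    gate_logits:   one router logit per aspect (hypothetical router output)
    expert_scores: one scalar score per aspect expert (hypothetical)
    Returns the combined reward and the names of the selected aspects.
    """
    if len(gate_logits) != len(ASPECTS) or len(expert_scores) != len(ASPECTS):
        raise ValueError("expected one logit and one score per aspect")
    # Indices of the k aspects the router considers most relevant.
    order = sorted(range(len(ASPECTS)), key=lambda i: gate_logits[i], reverse=True)
    selected = order[:top_k]
    # Renormalize the gate over the selected experts only.
    weights = softmax([gate_logits[i] for i in selected])
    reward = sum(w * expert_scores[i] for w, i in zip(weights, selected))
    return reward, [ASPECTS[i] for i in selected]


def prefer(reward_a, reward_b):
    """Return which of two videos the combined reward prefers."""
    return "A" if reward_a >= reward_b else "B"


# Made-up inputs: the router favors Alignment and Safety for this prompt.
reward_a, used = moe_reward([2.0, 2.0, -5.0, -5.0, -5.0], [1.0, 0.0, 0.5, 0.5, 0.5])
reward_b, _ = moe_reward([2.0, 2.0, -5.0, -5.0, -5.0], [0.2, 0.2, 0.9, 0.9, 0.9])
print(used, round(reward_a, 3), round(reward_b, 3), prefer(reward_a, reward_b))
```

In this toy setup, experts with low router logits do not influence the reward at all, which is one way a reward model can ignore aspects that are irrelevant to a given prompt.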
Taxonomy
Topics: Video Analysis and Summarization · Generative Adversarial Networks and Image Synthesis · Multimedia Communication and Technology
