AIGVE-MACS: Unified Multi-Aspect Commenting and Scoring Model for AI-Generated Video Evaluation
Xiao Liu, Jiawei Zhang

TL;DR
AIGVE-MACS is a unified model that evaluates AI-generated videos by providing both numerical scores and detailed multi-aspect comments, improving interpretability and alignment with human judgment.
Contribution
It introduces AIGVE-MACS, a novel model that combines scoring and multi-aspect commenting for video evaluation, and presents AIGVE-BENCH 2, a large-scale benchmark dataset.
Findings
Achieves state-of-the-art correlation with human scores
Generates high-quality, multi-aspect comments
Enhances video quality by 53.5% through feedback loop
Abstract
The rapid advancement of AI-generated video models has created a pressing need for robust and interpretable evaluation frameworks. Existing metrics are limited to producing numerical scores without explanatory comments, resulting in low interpretability and human evaluation alignment. To address those challenges, we introduce AIGVE-MACS, a unified model for AI-Generated Video Evaluation(AIGVE), which can provide not only numerical scores but also multi-aspect language comment feedback in evaluating these generated videos. Central to our approach is AIGVE-BENCH 2, a large-scale benchmark comprising 2,500 AI-generated videos and 22,500 human-annotated detailed comments and numerical scores across nine critical evaluation aspects. Leveraging AIGVE-BENCH 2, AIGVE-MACS incorporates recent Vision-Language Models with a novel token-wise weighted loss and a dynamic frame sampling strategy to…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMultimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning
