AIGVE-MACS: Unified Multi-Aspect Commenting and Scoring Model for AI-Generated Video Evaluation

Xiao Liu; Jiawei Zhang

arXiv:2507.01255·cs.CV·July 3, 2025

TL;DR

AIGVE-MACS is a unified model that evaluates AI-generated videos by providing both numerical scores and detailed multi-aspect comments, improving interpretability and alignment with human judgment.

Contribution

The paper introduces AIGVE-MACS, a model that combines numerical scoring with multi-aspect commenting for video evaluation, and presents AIGVE-BENCH 2, a large-scale benchmark dataset.

Findings

01 Achieves state-of-the-art correlation with human scores

02 Generates high-quality, multi-aspect comments

03 Enhances video quality by 53.5% through a feedback loop
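
The first finding is about agreement between model scores and human scores. This page does not say which correlation statistic is reported, so the following is only a minimal sketch that assumes Spearman rank correlation, a common choice when comparing an automatic metric against human ratings. The function names and the toy scores are hypothetical and are not taken from the paper.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data (made up): one score per video from the model and from annotators.
model_scores = [4.5, 2.0, 3.5, 1.0]
human_scores = [5, 2, 4, 1]
rho = spearman(model_scores, human_scores)
print(f"Spearman correlation: {rho:.3f}")
```

A rank-based statistic only checks whether the model orders videos the same way humans do, so it is insensitive to differences in score scale between the two.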

Abstract

The rapid advancement of AI-generated video models has created a pressing need for robust and interpretable evaluation frameworks. Existing metrics are limited to producing numerical scores without explanatory comments, resulting in low interpretability and poor alignment with human evaluation. To address these challenges, we introduce AIGVE-MACS, a unified model for AI-Generated Video Evaluation (AIGVE), which provides not only numerical scores but also multi-aspect language feedback on the generated videos. Central to our approach is AIGVE-BENCH 2, a large-scale benchmark comprising 2,500 AI-generated videos and 22,500 human-annotated detailed comments and numerical scores across nine critical evaluation aspects. Leveraging AIGVE-BENCH 2, AIGVE-MACS incorporates recent Vision-Language Models with a novel token-wise weighted loss and a dynamic frame sampling strategy to…
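
The abstract names a token-wise weighted loss, but the excerpt is cut off before describing it. The sketch below shows one generic form such a loss could take: a weighted mean of per-token negative log-likelihoods. It assumes, purely for illustration, that tokens carrying a numerical score receive a larger weight than ordinary comment tokens. The function name, the weights, and the toy probabilities are all hypothetical and should not be read as the authors' formulation.

```python
import math


def token_weighted_nll(log_probs, target_ids, weights):
    """Weighted mean negative log-likelihood over one token sequence.

    log_probs:  one list of vocabulary log-probabilities per position
    target_ids: index of the reference token at each position
    weights:    non-negative weight per position (hypothetical scheme)
    """
    if not (len(log_probs) == len(target_ids) == len(weights)):
        raise ValueError("inputs must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(
        -w * lp[t] for lp, t, w in zip(log_probs, target_ids, weights)
    )
    return weighted_sum / total_weight


# Toy example with a two-token vocabulary and two positions.
# Position 0 stands in for a comment token, position 1 for a score token.
log_probs = [
    [math.log(0.5), math.log(0.5)],
    [math.log(0.25), math.log(0.75)],
]
target_ids = [0, 0]
weights = [1.0, 3.0]  # assumed: the score token counts three times as much

loss = token_weighted_nll(log_probs, target_ids, weights)
print(f"weighted loss: {loss:.4f}")
```

With uniform weights this reduces to the ordinary mean cross-entropy, so the weights act only as a way of shifting emphasis toward selected tokens.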

Code & Models

Models

🤗 xiaoliux/AIGVE-MACS
model · 11 dl · ♡ 3

Taxonomy

Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning