Towards Holistic Visual Quality Assessment of AI-Generated Videos: A LLM-Based Multi-Dimensional Evaluation Model

Zelu Qi; Ping Shi; Chaoyang Zhang; Shuqi Wang; Fei Zhao; Da Pan; Zefeng Ying

arXiv:2506.04715·cs.CV·June 13, 2025

TL;DR

This paper presents a multi-dimensional, LLM-based model for automatic visual quality assessment of AI-generated videos, addressing defects like noise and jitter by decomposing quality into technical, motion, and semantic aspects.

Contribution

It introduces a novel multi-dimensional evaluation framework using LLMs with prompt engineering and LoRA fine-tuning for improved quality assessment of AI-generated videos.
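As a rough illustration of the LoRA idea mentioned above, the sketch below applies a low-rank update to a frozen weight matrix, giving an effective weight of W + (alpha / r) * B A in which only A and B would be trained. The matrix shapes, the `alpha` value, and the helper names `matmul` and `lora_effective_weight` are illustrative assumptions. This is not the paper's code, and the rank and target layers the authors used are not stated in this summary.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * (B @ A), the standard LoRA reparameterization.

    w: frozen weight, shape (d_out, d_in)
    a: trainable down-projection, shape (r, d_in)
    b: trainable up-projection, shape (d_out, r)
    """
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

# Hypothetical toy sizes, chosen only for illustration.
d_out, d_in, r = 4, 6, 2
rng = random.Random(0)
w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
b = [[0.0] * r for _ in range(d_out)]  # B starts at zero, so the update is initially a no-op

w_eff = lora_effective_weight(w, a, b, alpha=4.0)
```

With these toy sizes the adapter holds r * (d_in + d_out) = 20 trainable values against 24 in the full matrix; the saving grows quickly at realistic layer sizes, which is the usual motivation for LoRA.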

Findings

01. Achieved second place in the NTIRE 2025 challenge for AI-generated video quality assessment.

02. Effectively models the technical, motion, and semantic quality dimensions.

03. Demonstrates the potential of LLMs in multi-dimensional video quality evaluation.

Abstract

The development of AI-Generated Video (AIGV) technology has been remarkable in recent years, significantly transforming the paradigm of video content production. However, AIGVs still suffer from noticeable visual quality defects, such as noise, blurriness, frame jitter, and a low dynamic degree, which severely impact the user's viewing experience. Therefore, effective automatic visual quality assessment is of great importance for AIGV content regulation and generative model improvement. In this work, we decompose the visual quality of AIGVs into three dimensions: technical quality, motion quality, and video semantics. For each dimension, we design a corresponding encoder to achieve effective feature representation. Moreover, considering the outstanding performance of large language models (LLMs) in various vision and language tasks, we introduce an LLM as the quality regression module. To…
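The decomposition described in the abstract (one encoder per dimension, with the features then passed to a regression module) can be sketched structurally as follows. Everything here is a toy assumption: the three feature functions are hand-written placeholders rather than the paper's encoders, the video is a list of flat pixel lists, and a plain linear function stands in for the LLM regressor. The weights are arbitrary illustrative values, not learned parameters.

```python
from typing import List

Video = List[List[float]]  # toy stand-in: frames as flat lists of pixel intensities

def technical_features(video: Video) -> List[float]:
    """Placeholder technical branch: mean intensity and mean per-frame variance."""
    means = [sum(f) / len(f) for f in video]
    variances = [sum((p - m) ** 2 for p in f) / len(f) for f, m in zip(video, means)]
    return [sum(means) / len(means), sum(variances) / len(variances)]

def motion_features(video: Video) -> List[float]:
    """Placeholder motion branch: mean absolute difference between consecutive frames."""
    diffs = [sum(abs(x - y) for x, y in zip(f1, f0)) / len(f0)
             for f0, f1 in zip(video, video[1:])]
    return [sum(diffs) / len(diffs)] if diffs else [0.0]

def semantic_features(video: Video) -> List[float]:
    """Placeholder semantic branch: intensity range of the middle frame."""
    mid = video[len(video) // 2]
    return [max(mid) - min(mid)]

# Fixed branch order so the concatenated feature layout is stable.
ENCODERS = (technical_features, motion_features, semantic_features)

def encode(video: Video) -> List[float]:
    """Concatenate the per-dimension features into one vector."""
    feats: List[float] = []
    for encoder in ENCODERS:
        feats.extend(encoder(video))
    return feats

def regress_quality(features: List[float], weights: List[float], bias: float) -> float:
    """Linear stand-in for the regression module (the paper uses an LLM here)."""
    return bias + sum(w * x for w, x in zip(weights, features))

video = [[0.0, 0.5, 1.0, 0.5], [0.1, 0.5, 0.9, 0.5], [0.2, 0.5, 0.8, 0.5]]
features = encode(video)
score = regress_quality(features, weights=[0.5, 1.0, -2.0, 0.3], bias=0.0)
```

Keeping each dimension behind its own function means a branch can be swapped for a learned encoder without touching the fusion or regression step.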

Code & Models

Repositories

qizelu/aigveval (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Image and Video Quality Assessment