VersusQ: Pairwise Margin Reasoning for Generalizable Video Quality Assessment
Shibei Meng, Binxin Yang, Yuan Liu, Jiexuan Zhang, Zhengyao Lv, Hubery Yin, Qiang Xu

TL;DR
VersusQ introduces a pairwise comparison framework for video quality assessment using large multimodal models, improving cross-domain generalization and ranking accuracy by focusing on perceptual differences rather than absolute scores.
Contribution
The paper proposes VersusQ, a pairwise margin reasoning method that improves generalization and ranking in video quality assessment by relying on direct comparisons between videos.
Findings
Achieves state-of-the-art performance on multiple benchmarks.
Demonstrates strong cross-domain generalization.
Provides reliable fine-grained ranking in diverse scenarios.
Abstract
Large Multimodal Models (LMMs) have shown promise for video quality assessment, but most methods still predict an absolute score for each video. Such pointwise supervision often mixes perceptual quality with dataset-specific calibration, including annotation protocols, rating habits, and score distributions. As a result, the learned scoring rule may work well within a benchmark but transfer poorly to unseen domains. We argue that relative comparisons alleviate the absolute-scale calibration bias by focusing purely on perceptual differences rather than dataset-specific rating habits. We therefore propose VersusQ, a pairwise margin reasoning framework driven entirely by direct comparisons. Specifically, VersusQ performs LMM-based comparison between two videos, reasons about their visual and temporal quality differences, and predicts a signed continuous margin that…
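
The calibration argument in the abstract can be illustrated with a small sketch. This is not the VersusQ model. It is a toy example that assumes each dataset reports the same latent quality through its own positive affine rating scale, a simplification of the "annotation protocols, rating habits, and score distributions" mentioned above. The names `pairwise_margins`, `preferences`, and `recalibrate`, and all numbers, are hypothetical.

```python
from itertools import combinations


def pairwise_margins(scores):
    """Signed margin scores[i] - scores[j] for every pair i < j."""
    indices = range(len(scores))
    return {(i, j): scores[i] - scores[j] for i, j in combinations(indices, 2)}


def preferences(scores):
    """Sign of each margin: +1 if video i is better, -1 if video j is, 0 for a tie."""
    return {pair: (m > 0) - (m < 0) for pair, m in pairwise_margins(scores).items()}


def recalibrate(scores, scale, offset):
    """Assumed dataset-specific calibration: a positive affine map of latent quality."""
    return [scale * s + offset for s in scores]


# Hypothetical latent perceptual quality of four videos.
latent = [0.2, 0.9, 0.5, 0.7]

# Two hypothetical datasets that rate the same videos on different scales.
dataset_a = recalibrate(latent, scale=4.0, offset=1.0)    # e.g. a 1-5 style scale
dataset_b = recalibrate(latent, scale=100.0, offset=0.0)  # e.g. a 0-100 style scale

# Absolute scores disagree across datasets, but every pairwise preference agrees.
same_absolute = dataset_a == dataset_b
same_preferences = preferences(dataset_a) == preferences(dataset_b)
```

Under this assumption, the offset cancels in every difference and a positive scale leaves each sign unchanged, so the pairwise preferences are shared across datasets even though the absolute scores are not. The magnitude of a margin still depends on the scale, so how a continuous margin is normalized across datasets is a separate design question that this sketch does not address.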
