PreResQ-R1: Towards Fine-Grained Rank-and-Score Reinforcement Learning for Visual Quality Assessment via Preference-Response Disentangled Policy Optimization

Zehui Feng; Tian Qiu; Tong Wu; Junxuan Li; Huayuan Xu; Ting Han

arXiv:2511.05393·cs.CV·November 10, 2025

TL;DR

PreResQ-R1 introduces a reinforcement learning framework that unifies absolute score regression and relative ranking for visual quality assessment, achieving state-of-the-art results on image and video quality benchmarks with interpretable reasoning.

Contribution

It proposes a novel Preference-Response Disentangled RL approach with dual-branch rewards and a new optimization scheme for perceptual quality assessment.

Findings

1. Achieves state-of-the-art results on 10 IQA and 5 VQA benchmarks.

2. Surpasses previous methods by 5.30% in IQA and 2.15% in VQA metrics.

3. Produces human-aligned reasoning traces explaining quality judgments.

Abstract

Visual Quality Assessment (QA) seeks to predict human perceptual judgments of visual fidelity. While recent multimodal large language models (MLLMs) show promise in reasoning about image and video quality, existing approaches mainly rely on supervised fine-tuning or rank-only objectives, resulting in shallow reasoning, poor score calibration, and limited cross-domain generalization. We propose PreResQ-R1, a Preference-Response Disentangled Reinforcement Learning framework that unifies absolute score regression and relative ranking consistency within a single reasoning-driven optimization scheme. Unlike prior QA methods, PreResQ-R1 introduces a dual-branch reward formulation that separately models intra-sample response coherence and inter-sample preference alignment, optimized via Group Relative Policy Optimization (GRPO). This design encourages fine-grained, stable, and interpretable…
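To make the dual-branch idea concrete, here is a minimal sketch of how a score-based reward and a ranking-based reward might feed the group-relative advantage used by GRPO. The abstract does not give the reward formulas, so everything below is an assumption made for illustration: the function names `response_reward` and `preference_reward`, the linear score penalty, the binary ranking reward, the 0.5/0.5 weighting, and the toy numbers are all hypothetical and are not the paper's Preference-Response Disentangled scheme. Only the advantage normalization (reward minus group mean, divided by group standard deviation) follows the standard GRPO formulation.

```python
from statistics import mean, pstdev


def response_reward(pred, mos, score_range=4.0):
    """Hypothetical absolute-score reward (assumption, not from the paper).

    Equals 1.0 when the predicted score matches the mean opinion score (MOS)
    and falls linearly to 0.0 as the error approaches `score_range`.
    """
    return max(0.0, 1.0 - abs(pred - mos) / score_range)


def preference_reward(pred_a, pred_b, mos_a, mos_b):
    """Hypothetical ranking reward (assumption, not from the paper).

    Returns 1.0 if the predicted ordering of two samples agrees with the
    ordering of their MOS values, else 0.0. Ties count as disagreement.
    """
    return 1.0 if (pred_a - pred_b) * (mos_a - mos_b) > 0 else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Standard GRPO-style advantage: normalize rewards within one group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Toy group of four sampled responses for one image pair (a, b).
# Each tuple holds the scores one sampled response assigns to a and b.
mos_a, mos_b = 4.0, 2.0
group = [(3.8, 2.5), (3.0, 3.2), (4.5, 1.5), (2.0, 2.0)]

# Illustrative equal weighting of the two branches (an assumption).
rewards = [
    0.5 * response_reward(pa, mos_a)
    + 0.5 * preference_reward(pa, pb, mos_a, mos_b)
    for pa, pb in group
]
advantages = group_relative_advantages(rewards)

for (pa, pb), r, adv in zip(group, rewards, advantages):
    print(f"pred=({pa}, {pb})  reward={r:.3f}  advantage={adv:+.3f}")
```

In this toy group, responses that both score image a close to its MOS and rank it above image b receive positive advantages, while responses that misrank the pair fall below the group mean. A policy-gradient update would then raise the likelihood of the former and lower that of the latter.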

Taxonomy

Topics: Image and Video Quality Assessment · Visual Attention and Saliency Detection · Multimodal Machine Learning Applications