FinPercep-RM: A Fine-grained Reward Model and Co-evolutionary Curriculum for RL-based Real-world Super-Resolution

Yidi Liu; Zihao Fan; Jie Huang; Jie Xiao; Dong Li; Wenlong Zhang; Lei Bai; Xueyang Fu; Zheng-Jun Zha

arXiv:2512.22647 · cs.CV · April 7, 2026

TL;DR

This paper introduces FinPercep-RM, a fine-grained reward model for super-resolution that localizes distortions, and a co-evolutionary curriculum to stabilize training and improve perceptual quality.

Contribution

The paper proposes an encoder-decoder reward model that outputs a degradation map alongside a global quality score, and a co-evolutionary curriculum learning approach for stable RLHF-based super-resolution.

Findings

01. The fine-grained reward model improves local defect detection.

02. The co-evolutionary curriculum stabilizes training and enhances perceptual quality.

03. Experiments show better global and local super-resolution results.

Abstract

Reinforcement Learning with Human Feedback (RLHF) has proven effective in image generation, where reward models guide the generator toward human preferences. Motivated by this, adapting RLHF to Image Super-Resolution (ISR) tasks has shown promise for optimizing perceptual quality, with Image Quality Assessment (IQA) models serving as reward models. However, traditional IQA models usually output a single global score, which is exceptionally insensitive to local and fine-grained distortions. This insensitivity allows ISR models to produce perceptually undesirable artifacts that still yield spuriously high scores, which misaligns the optimization objective with perceptual quality and results in reward hacking. To address this, we propose a Fine-grained Perceptual Reward Model (FinPercep-RM) based on an Encoder-Decoder architecture. While providing a global quality score, it also generates a Perceptual Degradation Map…
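The insensitivity of a single global score to a small local artifact can be illustrated with a toy calculation. The sketch below is not the FinPercep-RM architecture or its reward formula: the error map is hand-built rather than predicted by a learned model, and the helper names (`make_error_map`, `global_score`, `worst_patch_error`, `fine_grained_reward`) and the weight `lam` are hypothetical, chosen only for illustration.

```python
# Toy illustration only: NOT the FinPercep-RM model or its reward formula.
# The error map is hand-built; in the paper a learned model predicts the map.


def make_error_map(size, top, left, artifact_size):
    """Return a size x size map of zeros with one square patch of error 1.0."""
    grid = [[0.0] * size for _ in range(size)]
    for r in range(top, top + artifact_size):
        for c in range(left, left + artifact_size):
            grid[r][c] = 1.0
    return grid


def global_score(error_map):
    """Single scalar quality score: 1 minus the mean per-pixel error."""
    total = sum(sum(row) for row in error_map)
    count = len(error_map) * len(error_map[0])
    return 1.0 - total / count


def worst_patch_error(error_map, patch):
    """Largest mean error over all patch x patch windows (stride 1)."""
    height, width = len(error_map), len(error_map[0])
    worst = 0.0
    for r in range(height - patch + 1):
        for c in range(width - patch + 1):
            window_sum = sum(
                error_map[r + i][c + j]
                for i in range(patch)
                for j in range(patch)
            )
            worst = max(worst, window_sum / (patch * patch))
    return worst


def fine_grained_reward(error_map, patch=2, lam=0.5):
    """Hypothetical combination: global score minus a weighted local penalty."""
    return global_score(error_map) - lam * worst_patch_error(error_map, patch)


clean = make_error_map(16, 0, 0, 0)    # no artifact at all
flawed = make_error_map(16, 5, 7, 2)   # one fully corrupted 2x2 patch

print(global_score(clean), global_score(flawed))
print(fine_grained_reward(clean), fine_grained_reward(flawed))
```

In this toy setting the corrupted 2x2 patch covers only 4 of 256 pixels, so the global score barely moves, while the patch-level penalty separates the two images clearly. That gap is the kind of signal a spatial degradation map is meant to expose to the optimizer.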
