LucidNFT: LR-Anchored Multi-Reward Preference Optimization for Flow-Based Real-World Super-Resolution
Song Fei, Tian Ye, Sixiang Chen, Zhaohu Xing, Jianyu Lai, Lei Zhu

TL;DR
LucidNFT is a reinforcement learning framework that enhances flow-based real-world image super-resolution by improving fidelity, diversity, and robustness against real degradations through novel evaluation, normalization, and data collection strategies.
Contribution
It introduces three components: LucidConsistency, a degradation-invariant evaluator of LR-referenced faithfulness; a decoupled reward normalization method; and a large-scale real-world degradation dataset for improved super-resolution.
Findings
Improves perceptual quality of flow-based Real-ISR models.
Maintains LR-referenced consistency across diverse real-world scenarios.
Enhances robustness against real-world degradations.
Abstract
Generative real-world image super-resolution (Real-ISR) can synthesize visually convincing details from severely degraded low-resolution (LR) inputs, yet its stochastic sampling makes a critical failure mode hard to avoid: outputs may look sharp but be unfaithful to the LR evidence, exhibiting semantic or structural hallucinations. Preference-based reinforcement learning (RL) is a natural fit because each LR input yields a rollout group of candidate restorations. However, effective alignment in Real-ISR is hindered by three coupled challenges: (i) the lack of an LR-referenced faithfulness signal that is robust to degradation yet sensitive to localized hallucinations, (ii) a rollout-group optimization bottleneck where scalarizing heterogeneous rewards before normalization compresses objective-wise contrasts and weakens DiffusionNFT-style reward-weighted updates, and (iii) limited…
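Challenge (ii) in the abstract can be illustrated with a toy rollout group. The sketch below is not the paper's formulation. It assumes two hypothetical rewards on very different scales ("quality" near 0-100, "consistency" in 0-1), equal hypothetical weights, and plain z-score normalization within the group. It contrasts scalarizing before normalizing with normalizing each objective separately and then combining.

```python
from statistics import mean, pstdev


def zscore(values, eps=1e-8):
    """Normalize a list of rewards within one rollout group."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / (sigma + eps) for v in values]


def scalarize_then_normalize(rewards, weights):
    """Sum the raw weighted rewards first, then z-score the scalar."""
    names = list(rewards)
    n = len(rewards[names[0]])
    combined = [sum(weights[k] * rewards[k][i] for k in names) for i in range(n)]
    return zscore(combined)


def normalize_then_combine(rewards, weights):
    """Z-score each objective within the group, then take the weighted sum."""
    normed = {k: zscore(v) for k, v in rewards.items()}
    n = len(next(iter(normed.values())))
    return [sum(weights[k] * normed[k][i] for k in normed) for i in range(n)]


# Hypothetical rollout group: four candidate restorations of one LR input.
group = {
    "quality": [60.0, 70.0, 65.0, 75.0],    # large-scale reward (assumed)
    "consistency": [0.9, 0.2, 0.8, 0.3],    # small-scale reward (assumed)
}
weights = {"quality": 0.5, "consistency": 0.5}  # assumed equal weighting

coupled = scalarize_then_normalize(group, weights)
decoupled = normalize_then_combine(group, weights)

for i, (c, d) in enumerate(zip(coupled, decoupled)):
    print(f"candidate {i}: scalarize-first={c:+.3f}  per-objective={d:+.3f}")
```

In this toy group, the scalarize-first scores order the candidates exactly as the raw quality reward does, because the consistency values are too small to matter once the rewards are summed. With per-objective normalization, the consistency reward changes the ordering: candidate 2 (moderate quality, high consistency) ranks above candidate 1 (higher quality, low consistency).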
