GARDO: Reinforcing Diffusion Models without Reward Hacking

Haoran He; Yuxiao Ye; Jie Liu; Jiajun Liang; Zhiyong Wang; Ziyang Yuan; Xintao Wang; Hangyu Mao; Pengfei Wan; Ling Pan

arXiv:2512.24138 · cs.LG · January 1, 2026

TL;DR

GARDO is a flexible reinforcement learning framework for diffusion models that selectively regularizes uncertain samples, adaptively updates reference models, and boosts rewards for diverse high-quality outputs, effectively reducing reward hacking and improving diversity.

Contribution

GARDO introduces a novel selective regularization and adaptive reference update mechanism to improve diffusion model fine-tuning without reward hacking.

Findings

01. Mitigates reward hacking across various proxy rewards
02. Enhances generation diversity without sacrificing sample efficiency
03. Improves exploration by adaptive regularization

Abstract

Fine-tuning diffusion models via online reinforcement learning (RL) has shown great potential for enhancing text-to-image alignment. However, since precisely specifying a ground-truth objective for visual tasks remains challenging, the models are often optimized using a proxy reward that only partially captures the true goal. This mismatch often leads to reward hacking, where proxy scores increase while real image quality deteriorates and generation diversity collapses. While common solutions add regularization against the reference policy to prevent reward hacking, they compromise sample efficiency and impede the exploration of novel, high-reward regions, as the reference policy is usually sub-optimal. To address the competing demands of sample efficiency, effective exploration, and mitigation of reward hacking, we propose Gated and Adaptive Regularization with Diversity-aware…
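
As a rough illustration of the idea summarized in the TL;DR (regularize only uncertain samples and refresh the reference policy over time), the Python sketch below shows one way such a gate could look. It is not the authors' implementation: the uncertainty score, the threshold, the KL limit, and both function names are assumptions made for this example.

```python
import numpy as np


def gated_objective(rewards, kl_to_ref, uncertainty, beta=0.1, threshold=0.5):
    """Per-sample objective with a KL penalty applied only to uncertain samples.

    Hypothetical inputs: `uncertainty` is any score of how little the proxy
    reward can be trusted for a sample, and `threshold` is an assumed cut-off.
    Neither is taken from the paper.
    """
    rewards = np.asarray(rewards, dtype=float)
    kl_to_ref = np.asarray(kl_to_ref, dtype=float)
    # Gate is 1.0 for samples above the threshold, 0.0 otherwise.
    gate = (np.asarray(uncertainty, dtype=float) > threshold).astype(float)
    # Confident samples keep their raw proxy reward; uncertain ones are penalized.
    return rewards - beta * gate * kl_to_ref, gate


def maybe_update_reference(ref_params, policy_params, mean_kl, kl_limit=1.0):
    """Assumed update rule: copy the policy into the reference once mean KL exceeds a limit."""
    if mean_kl > kl_limit:
        return policy_params.copy()
    return ref_params


# Toy batch of three samples; only the last two are treated as uncertain.
objective, gate = gated_objective(
    rewards=[1.0, 2.0, 3.0],
    kl_to_ref=[0.5, 0.5, 2.0],
    uncertainty=[0.1, 0.9, 0.8],
)
```

In this toy setup the first sample is left unregularized, so it can move freely toward higher proxy reward, while the two uncertain samples are pulled back toward the reference. Whether GARDO gates on a hard threshold or a soft weight is not stated in the text above.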

Taxonomy

Topics: Cell Image Analysis Techniques · Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques