Projected Gradient Ascent for Efficient Reward-Guided Updates with One-Step Generative Models
Jisung Hwang, Minhyuk Sung

TL;DR
This paper introduces an efficient method for reward-guided generation: test-time latent optimization under hard white Gaussian noise constraints, which improves speed and reliability over approaches that rely on soft regularization.
Contribution
It proposes a hard noise constraint via projected gradient ascent for reward-guided generation, reducing reward hacking and computational overhead.
Findings
Achieves comparable aesthetic scores in 30% less time than state-of-the-art methods.
Effectively prevents reward hacking during generation.
Maintains noise-like latent vectors throughout optimization.
Abstract
We propose a constrained latent optimization method for reward-guided generation that preserves white Gaussian noise characteristics with negligible overhead. Test-time latent optimization can unlock substantially better reward-guided generations from pretrained generative models, but it is prone to reward hacking that degrades quality and is also too slow for practical use. In this work, we make test-time optimization both efficient and reliable by replacing soft regularization with hard white Gaussian noise constraints enforced via projected gradient ascent. Our method applies a closed-form projection after each update to keep the latent vector explicitly noise-like throughout optimization, preventing the drift that leads to unrealistic artifacts. This enforcement adds minimal cost: the projection matches the complexity of standard algorithms such as sorting or FFT and does…
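The update loop described in the abstract (a gradient ascent step on the reward, followed by a closed-form projection back onto a noise-like set) can be illustrated with a minimal sketch. The visible text does not specify the paper's projection, so the one below is a deliberately simple stand-in chosen for illustration: it centers the latent to zero mean and rescales it to unit sample variance. The linear `toy_reward`, the `target` direction, and all function names, step sizes, and dimensions are hypothetical; in practice the reward would be computed on the output of a one-step generator and differentiated with an autodiff library.

```python
import math
import random


def project_to_noise_like(z):
    """Stand-in projection (an assumption, not the paper's operator):
    center z to zero mean, then rescale it to unit sample variance,
    i.e. to Euclidean norm sqrt(len(z))."""
    d = len(z)
    mean = sum(z) / d
    centered = [v - mean for v in z]
    norm = math.sqrt(sum(v * v for v in centered))
    if norm == 0.0:
        raise ValueError("cannot project a constant vector")
    scale = math.sqrt(d) / norm
    return [v * scale for v in centered]


def toy_reward(z, target):
    """Hypothetical differentiable reward: inner product with a target direction."""
    return sum(a * b for a, b in zip(z, target))


def toy_reward_grad(z, target):
    """Gradient of toy_reward with respect to z (constant for a linear reward)."""
    return list(target)


def projected_gradient_ascent(z0, target, step_size=0.1, num_steps=50):
    """Alternate a gradient ascent step on the reward with the projection."""
    z = project_to_noise_like(z0)
    for _ in range(num_steps):
        grad = toy_reward_grad(z, target)
        z = [v + step_size * g for v, g in zip(z, grad)]  # ascent step
        z = project_to_noise_like(z)                      # hard constraint
    return z


rng = random.Random(0)
dim = 64
z_init = [rng.gauss(0.0, 1.0) for _ in range(dim)]
target = [rng.gauss(0.0, 1.0) for _ in range(dim)]

z_start = project_to_noise_like(z_init)
z_final = projected_gradient_ascent(z_init, target)

print("reward before:", toy_reward(z_start, target))
print("reward after: ", toy_reward(z_final, target))
print("mean after:   ", sum(z_final) / dim)
print("variance after:", sum(v * v for v in z_final) / dim)
```

Because the projection runs after every step, the latent's mean and variance stay at 0 and 1 no matter how far the reward gradient pushes it. A soft penalty would only discourage drift from those statistics rather than rule it out. The stand-in constrains only the first two moments, which is far weaker than a full white Gaussian noise constraint.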
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning · Advanced Neural Network Applications
