ReNO: Enhancing One-step Text-to-Image Models through Reward-based Noise Optimization

Luca Eyring; Shyamgopal Karthik; Karsten Roth; Alexey Dosovitskiy; Zeynep Akata

arXiv:2406.04312 · cs.CV · November 1, 2024

TL;DR

ReNO is an inference-time method that optimizes a text-to-image model's initial noise against reward signals. Within a budget of 20-50 seconds, ReNO-enhanced one-step models outperform existing open-source models.

Contribution

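The paper proposes Reward-based Noise Optimization (ReNO), an inference-time technique that enhances T2I models by optimizing the initial noise against one or more human preference reward models. Because the model itself is not fine-tuned, ReNO sidesteps the reward hacking and poor generalization to unseen prompt distributions that affect reward-based fine-tuning.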

Findings

01. ReNO improves four different one-step T2I models on the T2I-CompBench and GenEval benchmarks.

02. Within a computational budget of 20-50 seconds, ReNO-enhanced one-step models outperform open-source models.

03. In user studies, ReNO-enhanced models are preferred nearly twice as often as SDXL.

Abstract

Text-to-Image (T2I) models have made significant advancements in recent years, but they still struggle to accurately capture intricate details specified in complex compositional prompts. While fine-tuning T2I models with reward objectives has shown promise, it suffers from "reward hacking" and may not generalize well to unseen prompt distributions. In this work, we propose Reward-based Noise Optimization (ReNO), a novel approach that enhances T2I models at inference by optimizing the initial noise based on the signal from one or multiple human preference reward models. Remarkably, solving this optimization problem with gradient ascent for 50 iterations yields impressive results on four different one-step models across two competitive benchmarks, T2I-CompBench and GenEval. Within a computational budget of 20-50 seconds, ReNO-enhanced one-step models consistently surpass the performance…
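The optimization loop described in the abstract (gradient ascent on the initial noise for 50 iterations, with the generator left untouched) can be illustrated with a minimal sketch. Everything below is a toy stand-in and an assumption for illustration only: `generate` is a fixed linear map rather than a one-step T2I model, `reward` is a negative squared distance rather than a human preference reward model, and the gradient is written out by hand. The step size `lr` and the function names are hypothetical and not taken from the paper or its repository.

```python
import random

def generate(noise, weights):
    """Toy one-step 'generator': a fixed linear map from noise to an 'image' vector."""
    return [sum(w * e for w, e in zip(row, noise)) for row in weights]

def reward(image, target):
    """Toy reward: negative squared distance to a target (higher is better)."""
    return -sum((x - t) ** 2 for x, t in zip(image, target))

def reward_grad_wrt_noise(noise, weights, target):
    """Analytic gradient of reward(generate(noise)) with respect to the noise."""
    image = generate(noise, weights)
    residual = [x - t for x, t in zip(image, target)]
    # d/d noise_j of -sum_i (W noise - t)_i^2 = -2 * sum_i W[i][j] * residual[i]
    return [
        -2.0 * sum(weights[i][j] * residual[i] for i in range(len(weights)))
        for j in range(len(noise))
    ]

def optimize_noise(noise, weights, target, steps=50, lr=0.05):
    """Gradient ascent on the initial noise; the generator weights stay frozen."""
    history = [reward(generate(noise, weights), target)]
    for _ in range(steps):
        grad = reward_grad_wrt_noise(noise, weights, target)
        noise = [e + lr * g for e, g in zip(noise, grad)]  # ascent step
        history.append(reward(generate(noise, weights), target))
    return noise, history

# Hypothetical toy setup: 4-dimensional noise, near-identity generator.
rng = random.Random(0)
dim = 4
weights = [[1.0 if i == j else 0.1 for j in range(dim)] for i in range(dim)]
target = [0.5, -0.25, 1.0, 0.0]
initial_noise = [rng.gauss(0.0, 1.0) for _ in range(dim)]

optimized_noise, history = optimize_noise(initial_noise, weights, target)
print(f"reward before: {history[0]:.4f}, after 50 steps: {history[-1]:.4f}")
```

The point of the sketch is the structure of the loop: only the noise vector is updated, so the same frozen generator can be reused for every prompt. With real image and reward models the gradient would presumably come from automatic differentiation rather than a hand-written formula.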

Code & Models

Repositories

explainableml/reno (PyTorch, official)

Videos

ReNO: Enhancing One-step Text-to-Image Models through Reward-based Noise Optimization · SlidesLive

Taxonomy

Topics: Handwritten Text Recognition Techniques

Methods: Diffusion