Lookahead Sample Reward Guidance for Test-Time Scaling of Diffusion Models

Yeongmin Kim; Donghyeok Shin; Byeonghu Na; Minsang Park; Richard Lee Kim; Il-Chul Moon

arXiv:2602.03211 · cs.LG · February 4, 2026


Open Access

TL;DR

This paper introduces LiDAR sampling, a test-time scaling method for diffusion models that steers generation toward higher human-aligned reward. It combines a closed-form guidance computation, which needs no neural backpropagation, with lookahead sampling, and performs comparably to gradient guidance methods at lower computational cost.

Contribution

The paper proposes a new formulation of the expected future reward (EFR) that enables closed-form guidance without neural backpropagation, and introduces LiDAR sampling, which adds lookahead sampling to improve efficiency and reward alignment.
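To make the idea of closed-form guidance concrete, the following is a hedged sketch of one standard way such a result can arise, not a transcription of the paper's derivation. It assumes a reward-tilted target $p^{*}(x_{0}) \propto p(x_{0})\exp(r(x_{0})/\lambda)$, a Gaussian forward kernel $q(x_{t} \mid x_{0}) = \mathcal{N}(x_{t}; \alpha_{t} x_{0}, \sigma_{t}^{2} I)$, and a log-expectation form of the EFR; the symbols $\lambda$, $\alpha_{t}$, $\sigma_{t}$, $V_{t}$, $\tilde{w}_{i}$ and $\hat{w}_{i}$ are illustrative and do not come from the source.

```latex
\begin{align}
V_{t}(x_{t})
  &= \log \mathbb{E}_{p(x_{0} \mid x_{t})}\!\left[\exp\!\big(r(x_{0})/\lambda\big)\right] \\
  &= \log \mathbb{E}_{p(x_{0})}\!\left[\exp\!\big(r(x_{0})/\lambda\big)\, q(x_{t} \mid x_{0})\right]
   - \log \mathbb{E}_{p(x_{0})}\!\left[q(x_{t} \mid x_{0})\right]
   && \text{(Bayes' rule)} \\
\nabla_{x_{t}} V_{t}(x_{t})
  &\approx \frac{\alpha_{t}}{\sigma_{t}^{2}}
     \left( \sum_{i=1}^{N} \tilde{w}_{i}\, x_{0}^{(i)} - \sum_{i=1}^{N} \hat{w}_{i}\, x_{0}^{(i)} \right),
   \qquad x_{0}^{(i)} \sim p(x_{0}), \\
\hat{w}_{i} &\propto q\big(x_{t} \mid x_{0}^{(i)}\big),
\qquad
\tilde{w}_{i} \propto \exp\!\big(r(x_{0}^{(i)})/\lambda\big)\, q\big(x_{t} \mid x_{0}^{(i)}\big).
\end{align}
```

Under these assumptions $x_{t}$ enters only through the Gaussian kernel, so the gradient is a difference of two weighted sample means and requires no backpropagation through the denoiser or the reward model.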

Findings

01. LiDAR achieves performance similar to gradient guidance methods with fewer samples.

02. Performance rises steeply as lookahead accuracy and sample count increase.

03. LiDAR reaches SDXL GenEval performance with 3 samples and a 3-step lookahead, 9.5x faster.

Abstract

Diffusion models have demonstrated strong generative performance; however, generated samples often fail to fully align with human intent. This paper studies a test-time scaling method that enables sampling from regions with higher human-aligned reward values. Existing gradient guidance methods approximate the expected future reward (EFR) at an intermediate particle $x_{t}$ using a Taylor approximation, but this approximation at each time step incurs high computational cost due to sequential neural backpropagation. We show that the EFR at any $x_{t}$ can be computed using only marginal samples from a pre-trained diffusion model. The proposed EFR formulation detaches the neural dependency between $x_{t}$ and the EFR, enabling closed-form guidance computation without neural backpropagation. To further improve efficiency, we introduce lookahead sampling to collect…
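The claim that the EFR can be computed from marginal samples alone can be illustrated numerically. The sketch below is not the paper's implementation: it assumes a Gaussian forward kernel with hypothetical parameters `alpha_t` and `sigma_t`, a temperature `lam`, and an EFR defined as the log of the expected exponentiated reward. The function names `log_efr` and `closed_form_guidance` are made up for illustration.

```python
import numpy as np


def _logsumexp(a):
    # Numerically stable log(sum(exp(a))).
    m = a.max()
    return m + np.log(np.exp(a - m).sum())


def _softmax(a):
    # Numerically stable normalized weights.
    z = np.exp(a - a.max())
    return z / z.sum()


def _log_kernel(x_t, x0, alpha_t, sigma_t):
    # log N(x_t; alpha_t * x0_i, sigma_t^2 I) up to a constant shared by all samples.
    return -np.sum((x_t - alpha_t * x0) ** 2, axis=1) / (2.0 * sigma_t**2)


def log_efr(x_t, x0, rewards, alpha_t, sigma_t, lam):
    """Monte Carlo estimate of log E[exp(r(x0)/lam) | x_t] (assumed EFR form).

    x_t: (d,) intermediate particle; x0: (n, d) marginal samples;
    rewards: (n,) reward of each sample, computed once and reused.
    """
    log_q = _log_kernel(x_t, x0, alpha_t, sigma_t)
    return _logsumexp(log_q + rewards / lam) - _logsumexp(log_q)


def closed_form_guidance(x_t, x0, rewards, alpha_t, sigma_t, lam):
    """Gradient of log_efr with respect to x_t, written in closed form.

    It is a difference of two weighted sample means; no autograd is involved.
    """
    log_q = _log_kernel(x_t, x0, alpha_t, sigma_t)
    w_tilt = _softmax(log_q + rewards / lam)  # reward-tilted posterior weights
    w_post = _softmax(log_q)                  # plain posterior weights
    return (alpha_t / sigma_t**2) * ((w_tilt - w_post) @ x0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x0 = rng.normal(size=(64, 4))               # stand-in for model samples
    rewards = -np.sum((x0 - 1.0) ** 2, axis=1)  # toy reward, not a real reward model
    x_t = rng.normal(size=4)
    print(closed_form_guidance(x_t, x0, rewards, alpha_t=0.8, sigma_t=0.6, lam=0.5))
```

In this toy setting the rewards are evaluated once per sample, and `x_t` affects the result only through the Gaussian kernel weights, which is why the gradient has a closed form.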

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks · Stochastic Gradient Optimization Techniques