Designing Instance-Level Sampling Schedules via REINFORCE with James-Stein Shrinkage

Peiyu Yu; Suraj Kothawade; Sirui Xie; Ying Nian Wu; Hongliang Fei

arXiv:2511.22177·cs.LG·April 28, 2026

TL;DR

This paper learns instance-level (prompt- and noise-conditioned) sampling schedules for frozen text-to-image samplers, improving text-image alignment and few-step efficiency without retraining the model weights.

Contribution

It proposes a single-pass Dirichlet policy that outputs sampling schedules, trained with REINFORCE using a James-Stein-based reward baseline that lowers the estimation error of the policy gradient.
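To make the idea of a shrinkage baseline concrete, here is a minimal sketch of the classical positive-part James-Stein estimator applied to per-prompt mean rewards, shrinking each one toward the grand mean. This page does not give the paper's exact estimator, so the function name `james_stein_baseline`, the reward layout (prompts by samples), and the pooled-variance choice are all assumptions made for illustration.

```python
import numpy as np

def james_stein_baseline(rewards, eps=1e-8):
    """Shrink per-prompt mean rewards toward the grand mean.

    rewards: array of shape (num_prompts, samples_per_prompt), with
    num_prompts >= 4 and samples_per_prompt >= 2 (assumed layout).
    Returns one baseline value per prompt.
    """
    rewards = np.asarray(rewards, dtype=float)
    p, n = rewards.shape
    prompt_means = rewards.mean(axis=1)
    grand_mean = prompt_means.mean()
    # Estimated variance of a single prompt mean, pooled over prompts.
    sigma2 = rewards.var(axis=1, ddof=1).mean() / n
    spread = np.sum((prompt_means - grand_mean) ** 2)
    # Positive-part shrinkage factor in [0, 1]; (p - 3) is the usual
    # constant when shrinking toward an estimated common mean.
    shrink = max(0.0, 1.0 - (p - 3) * sigma2 / (spread + eps))
    return grand_mean + shrink * (prompt_means - grand_mean)
```

In a REINFORCE update, the advantage for a sample would then be its reward minus the baseline of its prompt. When the per-prompt means are noisy relative to their spread, the factor approaches 0 and the baseline falls back to the global mean; when they are well separated, it stays close to the per-prompt mean.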

Findings

1. Improves text-image alignment, including text rendering and compositional control, across modern Stable Diffusion and Flux model families.

2. Achieves quality comparable to distilled models with fewer sampling steps.

3. Works as a model-agnostic post-training enhancement: only the sampling schedule is learned, and the sampler's weights stay frozen.

Abstract

Most post-training methods for text-to-image samplers focus on model weights: either fine-tuning the backbone for alignment or distilling it for few-step efficiency. We take a different route: rescheduling the sampling timeline of a frozen sampler. Instead of a fixed, global schedule, we learn instance-level (prompt- and noise-conditioned) schedules through a single-pass Dirichlet policy. To ensure accurate gradient estimates in high-dimensional policy learning, we introduce a novel reward baseline based on a principled James-Stein estimator; it provably achieves lower estimation errors than commonly used variants and leads to superior performance. Our rescheduled samplers consistently improve text-image alignment, including text rendering and compositional control, across modern Stable Diffusion and Flux model families. Additionally, a 5-step Flux-Dev sampler with our schedules can…
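One way to picture a Dirichlet policy over schedules is sketched below: a Dirichlet sample gives positive step sizes that sum to 1, and their cumulative sum turns into a decreasing sequence of timesteps from 1.0 to 0.0. The names `sample_schedule` and `dirichlet_log_prob`, the time convention, and the fixed concentration vector `alpha` are assumptions for illustration. In the paper the policy is conditioned on the prompt and the initial noise, so the concentrations would presumably come from a learned network that is not shown here.

```python
import numpy as np
from math import lgamma

def sample_schedule(alpha, rng):
    """Draw step sizes on the simplex and convert them to timesteps."""
    increments = rng.dirichlet(alpha)  # positive entries that sum to 1
    timesteps = 1.0 - np.concatenate([[0.0], np.cumsum(increments)])
    timesteps[-1] = 0.0  # guard against floating-point rounding
    return increments, timesteps

def dirichlet_log_prob(increments, alpha):
    """Log-density of the sampled step sizes under Dirichlet(alpha)."""
    alpha = np.asarray(alpha, dtype=float)
    log_norm = lgamma(alpha.sum()) - sum(lgamma(a) for a in alpha)
    return log_norm + np.sum((alpha - 1.0) * np.log(increments))

rng = np.random.default_rng(0)
alpha = np.full(5, 2.0)  # hypothetical concentrations for a 5-step schedule
increments, timesteps = sample_schedule(alpha, rng)
log_p = dirichlet_log_prob(increments, alpha)
```

A REINFORCE-style update would weight the gradient of `log_p` with respect to the policy parameters by the reward of the image generated under `timesteps`, minus a baseline. Because the whole schedule is drawn in one sample, the policy needs only a single forward pass per image.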
