Variance Reduction for Expectations with Diffusion Teachers
Jesse Bettencourt, Xindi Wu, Matan Atzmon, James Lucas, Jonathan Lorraine

TL;DR
The paper introduces CARV, a compute-aware variance-accounting framework for the Monte Carlo expectations that diffusion-teacher pipelines consume. It reports 2-3x effective compute multipliers in text-to-3D distillation and attribution without changing the objective.
Contribution
CARV motivates a hierarchical Monte Carlo estimator that amortizes expensive upstream computation (rendering, simulation, encoding) over cheap diffusion-noise resamples, sharpened by timestep importance sampling and a stratified inverse-CDF construction.
Findings
CARV achieves 2-3x effective compute improvements in text-to-3D distillation and attribution.
Variance reduction techniques cut gradient variance by an order of magnitude in single-step distillation.
Most of the compute gains come from amortized reuse of upstream computations; importance sampling plus stratification adds roughly 25% more.
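Why amortized reuse can dominate the gains is easiest to see from the standard variance decomposition for a nested Monte Carlo estimator. The block below is a textbook sketch, not the paper's own accounting. The symbols are assumptions introduced here for illustration: N upstream draws, K noise resamples per draw, per-draw costs c_up and c_in, and i.i.d. inner samples.

```latex
% Nested estimator: N upstream draws x_i, each reused for K noise draws z_{ij}.
\hat{g} = \frac{1}{N}\sum_{i=1}^{N}\frac{1}{K}\sum_{j=1}^{K} f(x_i, z_{ij})

% Law of total variance, with
%   \sigma_{\mathrm{out}}^2 = \mathrm{Var}_x\!\big(\mathbb{E}_z[f \mid x]\big), \quad
%   \sigma_{\mathrm{in}}^2  = \mathbb{E}_x\!\big[\mathrm{Var}_z(f \mid x)\big]:
\mathrm{Var}(\hat{g}) = \frac{1}{N}\left(\sigma_{\mathrm{out}}^2 + \frac{\sigma_{\mathrm{in}}^2}{K}\right)

% Assumed cost model: C = N\,(c_{\mathrm{up}} + K\,c_{\mathrm{in}}).
% Minimizing \mathrm{Var}(\hat{g}) \cdot C over K gives
K^{\star} = \sqrt{\frac{c_{\mathrm{up}}\,\sigma_{\mathrm{in}}^2}{c_{\mathrm{in}}\,\sigma_{\mathrm{out}}^2}}
```

Under these assumptions, reuse pays off most when the upstream step is far more expensive than a noise resample and most of the variance comes from the noise rather than from the upstream draw.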
Abstract
Pretrained diffusion models serve as frozen teachers feeding downstream pipelines such as text-to-3D, single-step distillation, and data attribution. The teacher gradients these pipelines consume are Monte Carlo (MC) expectations over noise levels and Gaussian noise samples; their estimator variance dominates compute cost because each draw requires expensive upstream work (rendering, simulation, encoding). We introduce CARV, a compute-aware variance-accounting framework that motivates a hierarchical MC estimator: amortize the expensive upstream computation over cheap diffusion-noise resamples, sharpened by timestep importance sampling and a stratified-inverse-CDF construction. In our text-to-3D distillation and attribution experiments, CARV delivers 2-3x effective compute multipliers (most from amortized reuse; ~25% additional from IS+stratification) without changing the objective; in…
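The estimator described in the abstract can be sketched on a toy problem. This is a minimal NumPy illustration, not the authors' implementation. The names `stratified_inverse_cdf`, `hierarchical_estimate`, `upstream`, `integrand`, and `probs` are hypothetical, and the timestep axis is assumed to be a discrete grid with a uniform target distribution.

```python
import numpy as np

def stratified_inverse_cdf(probs, k, rng):
    """Draw k timestep indices from `probs`, one uniform per equal-width stratum of [0, 1)."""
    cdf = np.cumsum(probs)
    cdf[-1] = 1.0                                  # guard against round-off in the cumulative sum
    u = (np.arange(k) + rng.random(k)) / k         # one point inside each of the k strata
    idx = np.searchsorted(cdf, u, side="right")    # inverse-CDF lookup
    return np.minimum(idx, len(probs) - 1)         # guard the u == 1.0 round-off edge case

def hierarchical_estimate(upstream, integrand, probs, n_outer, k_inner, rng):
    """Estimate E_x E_{t ~ Uniform, eps ~ N(0, 1)}[integrand(x, t, eps)].

    Each expensive upstream sample x is reused for k_inner cheap (t, eps) draws.
    Timesteps come from the proposal `probs` and are reweighted to the uniform target.
    """
    probs = np.asarray(probs, dtype=float)
    num_steps = len(probs)
    total = 0.0
    for _ in range(n_outer):
        x = upstream(rng)                          # expensive step, done once per outer sample
        t = stratified_inverse_cdf(probs, k_inner, rng)
        eps = rng.standard_normal(k_inner)         # cheap noise resamples
        w = 1.0 / (num_steps * probs[t])           # importance weight: uniform target / proposal
        total += np.mean(w * integrand(x, t, eps))
    return total / n_outer

# Toy usage: the integrand grows linearly in t, and the proposal is chosen
# proportional to it, so every weighted term is the same constant.
rng = np.random.default_rng(0)
probs = np.arange(1, 5) / 10.0                     # [0.1, 0.2, 0.3, 0.4]
estimate = hierarchical_estimate(
    upstream=lambda r: 2.0,                        # stand-in for rendering / encoding
    integrand=lambda x, t, eps: x * (t + 1) + 0.0 * eps,
    probs=probs, n_outer=3, k_inner=8, rng=rng,
)
print(estimate)
```

In this toy case the proposal is proportional to the integrand, so each weighted term equals the true mean of 5 up to floating-point rounding. With a real teacher gradient the proposal would only approximate the ideal one and some variance would remain.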
