Fast and Stable Diffusion Planning through Variational Adaptive Weighting
Zhiying Qiu, Tao Lin

TL;DR
This paper introduces a variationally optimal, uncertainty-aware loss weighting method for diffusion models in offline reinforcement learning, significantly reducing training time while maintaining competitive performance.
Contribution
It proposes a novel closed-form polynomial approximation for online estimation of optimal loss weights, improving training stability and efficiency in diffusion-based offline RL.
Findings
Achieves up to 10x fewer training steps
Maintains competitive performance on standard benchmarks
Demonstrates stability and efficiency improvements
Abstract
Diffusion models have recently shown promise in offline RL. However, these methods often suffer from high training costs and slow convergence, particularly when using transformer-based denoising backbones. While several optimization strategies have been proposed -- such as modified noise schedules, auxiliary prediction targets, and adaptive loss weighting -- challenges remain in achieving stable and efficient training. In particular, existing loss weighting functions typically rely on neural network approximators, which can be ineffective early in training because MLPs generalize poorly when exposed to sparse feedback. In this work, we derive a variationally optimal uncertainty-aware weighting function and introduce a closed-form polynomial approximation method for its online estimation under the flow-based generative modeling…
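As a rough illustration of the idea in the abstract, the Python sketch below fits a polynomial in the time variable t to the log of observed per-sample losses by ordinary least squares, then uses exp(-u(t)) as the loss weight. This is not the authors' formulation, which the abstract does not spell out. It assumes the common uncertainty-weighted objective exp(-u(t)) * L(t) + u(t), whose minimizer is u(t) = log E[L(t)]. The function names, the polynomial degree, and the use of numpy.linalg.lstsq are choices made here for illustration only.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def fit_log_loss_poly(t, losses, degree=3, eps=1e-8):
    """Least-squares fit of u(t) ~ log(loss) with a polynomial in t.

    Hypothetical helper (not from the paper). `t` holds time values in
    [0, 1] and `losses` the matching positive per-sample losses.
    Returns coefficients c[0..degree] in increasing-power order.
    """
    t = np.asarray(t, dtype=float)
    y = np.log(np.asarray(losses, dtype=float) + eps)
    # Design matrix with columns 1, t, t^2, ..., t^degree.
    A = np.vander(t, degree + 1, increasing=True)
    # The least-squares solve needs no gradient-based training.
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coeffs


def loss_weights(t, coeffs):
    """Weight exp(-u(t)), with u(t) evaluated from the fitted polynomial."""
    u = P.polyval(np.asarray(t, dtype=float), coeffs)
    return np.exp(-u)


def weighted_objective(t, losses, coeffs):
    """Assumed uncertainty-weighted objective: mean(exp(-u) * L + u)."""
    t = np.asarray(t, dtype=float)
    u = P.polyval(t, coeffs)
    return float(np.mean(np.exp(-u) * np.asarray(losses, dtype=float) + u))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = rng.uniform(0.0, 1.0, size=256)
    # Synthetic losses whose scale varies with t (illustrative data only).
    losses = np.exp(1.0 + 2.0 * t) * rng.uniform(0.5, 1.5, size=256)
    c = fit_log_loss_poly(t, losses, degree=3)
    print("coefficients:", c)
    print("weighted objective:", weighted_objective(t, losses, c))
```

Because the fit is a linear least-squares problem, it can be recomputed from a running buffer of (t, loss) pairs at any point in training, which is one plausible reading of "online estimation" in the abstract. Fitting the log of individual losses estimates E[log L(t)] rather than log E[L(t)], so this sketch is only an approximation to the stated minimizer.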
Taxonomy
Topics: Reinforcement Learning in Robotics · Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning
