Diffusion Self-Weighted Guidance for Offline Reinforcement Learning
Augusto Tagle, Javier Ruiz-del-Solar, Felipe Tobar

TL;DR
This paper introduces Self-Weighted Guidance (SWG), a diffusion-based method for offline reinforcement learning that simplifies score computation by integrating weights directly into the diffusion model, achieving results comparable to the state of the art on D4RL benchmarks.
Contribution
The paper proposes a novel diffusion model framework that directly incorporates weight functions for offline RL, eliminating the need for additional network training.
Findings
SWG generates desired action distributions in toy examples.
SWG performs comparably to state-of-the-art methods on D4RL benchmarks.
Ablation studies validate the scalability and effectiveness of SWG.
Abstract
Offline reinforcement learning (RL) recovers the optimal policy given historical observations of an agent. In practice, the optimal policy is modeled as a weighted version of the agent's behavior policy, using a weight function that acts as a critic of the agent's behavior. Though recent approaches to offline RL based on diffusion models have exhibited promising results, computing the required scores is challenging because they depend on the unknown weight function. In this work, we alleviate this issue by constructing a diffusion over both the actions and the weights. In the proposed setting, the required scores are obtained directly from the diffusion model without learning extra networks. Our main conceptual contribution is a novel guidance method in which the guidance (a function of the weights) comes from the same diffusion model; our proposal is therefore termed Self-Weighted…
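The weighted-policy idea in the abstract can be illustrated with a toy discrete example. This is only a minimal sketch and not the paper's method: the exponential weight `exp(q / beta)`, the values in `q`, the temperature `beta`, and the helper name `weighted_policy` are all assumptions made for illustration, whereas SWG itself uses a diffusion model over continuous actions and weights.

```python
import numpy as np


def weighted_policy(mu, w):
    """Return the distribution proportional to mu * w over a discrete action grid."""
    unnorm = mu * w
    return unnorm / unnorm.sum()


# Hypothetical toy setup: five discrete actions in a single state.
mu = np.array([0.1, 0.2, 0.4, 0.2, 0.1])   # behavior policy (made-up numbers)
q = np.array([0.0, 1.0, 0.5, 2.0, -1.0])   # made-up critic values
beta = 1.0                                  # temperature (assumption)
w = np.exp(q / beta)                        # one common weight choice, assumed here

pi = weighted_policy(mu, w)

# log pi = log mu + log w - log Z, so the gap below is the same constant
# for every action. This additive structure is why the target score splits
# into a behavior term plus a weight-dependent guidance term.
gap = np.log(pi) - np.log(mu) - np.log(w)

print("target policy:", np.round(pi, 3))
print("most likely action under mu:", int(np.argmax(mu)))
print("most likely action under pi:", int(np.argmax(pi)))
```

The weights shift probability mass toward actions the critic rates highly while staying within the support of the behavior policy. In the diffusion setting the same decomposition has to hold at every noise level, which is where the dependence on the unknown weight function becomes difficult.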
Taxonomy
Topics: Adaptive Dynamic Programming Control
Methods: Diffusion
