DiffCPS: Diffusion Model based Constrained Policy Search for Offline Reinforcement Learning

Longxiang He; Li Shen; Linrui Zhang; Junbo Tan; Xueqian Wang

arXiv:2310.05333 · cs.LG · February 29, 2024

TL;DR

DiffCPS introduces a diffusion model-based approach for constrained policy search in offline reinforcement learning, overcoming limitations of Gaussian policies and enabling better policy expressivity with theoretical guarantees.

Contribution

The paper proposes DiffCPS, a novel diffusion model-based constrained policy search method with a primal-dual framework and theoretical analysis of duality and convergence.

Findings

01

DiffCPS outperforms traditional AWR-based methods on D4RL benchmarks.

02

DiffCPS achieves competitive or superior results compared to recent diffusion-based offline RL methods.

03

Theoretical analysis confirms strong duality and convergence properties of DiffCPS.

Abstract

Constrained policy search (CPS) is a fundamental problem in offline reinforcement learning, which is generally solved by advantage weighted regression (AWR). However, previous methods may still encounter out-of-distribution actions due to the limited expressivity of Gaussian-based policies. On the other hand, directly applying the state-of-the-art models with distribution expression capabilities (i.e., diffusion models) in the AWR framework is infeasible, since AWR requires exact policy probability densities, which are intractable to compute for diffusion models. In this paper, we propose a novel approach, Diffusion-based Constrained Policy Search (dubbed DiffCPS), which tackles the diffusion-based constrained policy search with the primal-dual method. The theoretical analysis reveals that strong duality holds for diffusion-based CPS problems, and upon introducing parameter approximation,…
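
For readers unfamiliar with the primal-dual method named in the abstract, the following is a minimal sketch on a toy scalar problem. It is not the DiffCPS algorithm and is not taken from the paper or its repository. The objective, the constraint, the function name primal_dual, and the step size are all assumptions chosen for illustration. The sketch maximizes f(x) = -(x - 2)^2 subject to x <= 1 by alternating gradient ascent on x with projected gradient steps on a Lagrange multiplier.

```python
# Toy primal-dual iteration (illustrative only; NOT the DiffCPS algorithm).
# Problem (assumed for illustration): maximize -(x - 2)^2 subject to x <= 1.
# Lagrangian: L(x, lam) = -(x - 2)^2 - lam * (x - 1), with lam >= 0.


def primal_dual(steps=2000, lr=0.05):
    """Run simultaneous primal ascent on x and projected dual updates on lam."""
    x, lam = 0.0, 0.0
    for _ in range(steps):
        # Primal step: gradient ascent on L with respect to x.
        grad_x = -2.0 * (x - 2.0) - lam
        x_new = x + lr * grad_x
        # Dual step: raise lam while the constraint is violated (x > 1),
        # lower it otherwise, and project back onto lam >= 0.
        lam = max(0.0, lam + lr * (x - 1.0))
        x = x_new
    return x, lam


x_star, lam_star = primal_dual()
print(f"x = {x_star:.4f}, lambda = {lam_star:.4f}")
```

The unconstrained maximizer x = 2 violates the constraint, so the iterates settle at the constrained optimum x = 1, where the multiplier balances the objective gradient (lambda = 2). In the constrained policy search setting the primal variable would be the policy parameters rather than a scalar, but the alternating structure is the same idea.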

Code & Models

Repositories

felix-thu/DiffCPS (PyTorch, official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Machine Learning and Data Classification

Methods: Diffusion