Consistency Models as a Rich and Efficient Policy Class for Reinforcement Learning

Zihan Ding; Chi Jin

arXiv:2309.16984 · cs.LG · March 18, 2024 · 2 cites

TL;DR

This paper introduces the consistency policy, a fast and expressive policy class for reinforcement learning, leveraging consistency models to improve efficiency and performance across offline, offline-to-online, and online settings.

Contribution

The paper proposes the consistency policy as a novel, efficient policy representation based on consistency models, demonstrating its advantages over diffusion policies in RL tasks.

Findings

01 The consistency policy is more computationally efficient than the diffusion policy.

02 It achieves comparable performance in offline-to-online RL and higher average performance in online RL.

03 It is effective across offline, offline-to-online, and online RL settings.

Abstract

Score-based generative models such as diffusion models have proven effective at modeling multi-modal data in domains ranging from image generation to reinforcement learning (RL). However, inference with diffusion models can be slow because sampling is iterative, which hinders their use in RL. We propose to apply the consistency model as an efficient yet expressive policy representation, namely the consistency policy, with an actor-critic style algorithm for three typical RL settings: offline, offline-to-online, and online. For offline RL, we demonstrate the expressiveness of generative models as policies learned from multi-modal data. For offline-to-online RL, the consistency policy is shown to be more computationally efficient than the diffusion policy, with comparable performance. For online RL, the consistency policy demonstrates significant speedup and even higher average performance than the…
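The efficiency argument in the abstract rests on a consistency model producing a sample in one network evaluation instead of many denoising steps. The sketch below illustrates single-step action sampling under stated assumptions: the skip/output parameterization follows the original consistency-models formulation rather than anything confirmed for this paper's code, the constants SIGMA_DATA, EPS, and T_MAX are illustrative values, and toy_network is a hypothetical stand-in for a trained state-conditioned network.

```python
import math
import random

# Illustrative constants (assumptions, not taken from this paper).
SIGMA_DATA = 0.5   # assumed scale of the action data
EPS = 0.002        # assumed smallest noise level
T_MAX = 80.0       # assumed largest noise level


def c_skip(t):
    """Skip-connection weight; equals 1 at t = EPS."""
    return SIGMA_DATA ** 2 / ((t - EPS) ** 2 + SIGMA_DATA ** 2)


def c_out(t):
    """Network-output weight; equals 0 at t = EPS."""
    return SIGMA_DATA * (t - EPS) / math.sqrt(SIGMA_DATA ** 2 + t ** 2)


def toy_network(x, t, state, weights):
    """Hypothetical stand-in for a learned network F(x, t | state)."""
    features = list(x) + [math.log(t)] + list(state)
    return [math.tanh(sum(w * f for w, f in zip(row, features)))
            for row in weights]


def consistency_fn(x, t, state, weights):
    """f(x, t | s) = c_skip(t) * x + c_out(t) * F(x, t | s).

    The weights enforce the boundary condition f(x, EPS | s) = x.
    """
    out = toy_network(x, t, state, weights)
    return [c_skip(t) * xi + c_out(t) * oi for xi, oi in zip(x, out)]


def sample_action(state, weights, action_dim, rng):
    """Draw one action with a single evaluation of the consistency function."""
    x_T = [rng.gauss(0.0, T_MAX) for _ in range(action_dim)]
    return consistency_fn(x_T, T_MAX, state, weights)


def make_weights(action_dim, state_dim, rng):
    """Random weights for the toy network (untrained, for illustration only)."""
    n_features = action_dim + 1 + state_dim
    return [[rng.gauss(0.0, 0.1) for _ in range(n_features)]
            for _ in range(action_dim)]


rng = random.Random(0)
state = [0.1, -0.3, 0.7]
weights = make_weights(action_dim=2, state_dim=3, rng=rng)
action = sample_action(state, weights, action_dim=2, rng=rng)
```

A diffusion policy would instead call its network once per denoising step inside a loop, which is where the speedup claimed for the consistency policy would come from. The toy network here is untrained, so the sampled action is meaningless beyond showing the control flow.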

Code & Models

Repositories

quantumiracle/consistency_model_for_reinforcement_learning
PyTorch · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Language and cultural evolution

Methods: Diffusion