Deep Generative Models for Offline Policy Learning: Tutorial, Survey, and Perspectives on Future Directions
Jiayu Chen, Bhargav Ganguly, Yang Xu, Yongsheng Mei, Tian Lan, Vaneet Aggarwal

TL;DR
This paper systematically reviews the application of deep generative models in offline policy learning, covering five main types and their use in offline reinforcement learning and imitation learning, highlighting current progress and future directions.
Contribution
It provides the first comprehensive review of DGM-based offline policy learning, categorizing methods, analyzing development trends, and offering perspectives on future research.
Findings
Analyzed five mainstream deep generative models in offline policy learning.
Categorized existing works based on DGM usage and development stages.
Discussed future research directions and challenges in the field.
Abstract
Deep generative models (DGMs) have demonstrated great success across various domains, particularly in generating text, images, and videos using models trained from offline data. Similarly, data-driven decision-making and robotic control also require learning a generator function from offline data to serve as the strategy or policy. Applying deep generative models to offline policy learning therefore shows great potential, and numerous studies have explored this direction. However, the field still lacks a comprehensive review, so its different branches have developed relatively independently. In this paper, we provide the first systematic review of the applications of deep generative models for offline policy learning. In particular, we cover five mainstream deep generative models, including Variational Auto-Encoders, Generative Adversarial Networks, Normalizing…
Taxonomy
Topics: Policy Transfer and Learning
Methods: Diffusion · Normalizing Flows
