Efficient Planning with Latent Diffusion
Wenhao Li

TL;DR
This paper introduces LatentDiffuser, a framework that uses score-based diffusion models to plan in a continuous latent action space for offline reinforcement learning, improving efficiency and flexibility, especially on high-dimensional tasks.
Contribution
It proposes a unified approach to learning and planning in a continuous latent action space, establishes a theoretical equivalence between planning in that space and energy-guided sampling, and introduces a sequence-level exact sampling method.
Findings
Competitive performance on low-dimensional tasks
Outperforms existing methods on high-dimensional tasks
Efficient and flexible planning in continuous latent spaces
Abstract
Temporal abstraction and efficient planning pose significant challenges in offline reinforcement learning, mainly when dealing with domains that involve temporally extended tasks and delayed sparse rewards. Existing methods typically plan in the raw action space and can be inefficient and inflexible. Latent action spaces offer a more flexible paradigm, capturing only possible actions within the behavior policy support and decoupling the temporal structure between planning and modeling. However, current latent-action-based methods are limited to discrete spaces and require expensive planning. This paper presents a unified framework for continuous latent action space representation learning and planning by leveraging latent, score-based diffusion models. We establish the theoretical equivalence between planning in the latent action space and energy-guided sampling with a pretrained…
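The idea of energy-guided sampling mentioned in the abstract can be illustrated with a toy example. The sketch below is not the paper's algorithm: it assumes a one-dimensional latent, an analytic standard-normal prior standing in for the pretrained diffusion model, a hypothetical linear energy `E(z) = -w * z`, and plain Langevin dynamics in place of a learned, time-dependent reverse diffusion. Under those assumptions, the target density is proportional to `prior(z) * exp(-beta * E(z))`, and its score is the prior score minus `beta` times the energy gradient. All function names and parameters are illustrative.

```python
import random
import statistics


def prior_score(z):
    # Score of a standard normal prior: d/dz log N(z; 0, 1) = -z.
    # Assumption: stands in for the score of a pretrained model.
    return -z


def energy_grad(z, w=1.0):
    # Hypothetical energy E(z) = -w * z (lower energy means higher return),
    # so dE/dz = -w regardless of z.
    return -w


def guided_langevin(n_samples=2000, n_steps=200, step=0.05, beta=2.0, seed=0):
    """Draw samples from p(z) proportional to prior(z) * exp(-beta * E(z))."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        z = rng.gauss(0.0, 1.0)  # start from the prior
        for _ in range(n_steps):
            # Guided score: prior score minus beta times the energy gradient.
            score = prior_score(z) - beta * energy_grad(z)
            # Unadjusted Langevin update.
            z = z + step * score + (2.0 * step) ** 0.5 * rng.gauss(0.0, 1.0)
        samples.append(z)
    return samples


if __name__ == "__main__":
    samples = guided_langevin()
    print("sample mean:", statistics.mean(samples))
```

For this particular prior and energy, the guided target is a unit-variance Gaussian centered at `beta * w`, so the sample mean should land near 2.0 with the defaults. That shift away from the prior mean of 0 is the effect of the guidance term.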
Peer Reviews
Decision·ICLR 2024 poster
1. The paper is well presented. The algorithm is straightforward and easy to understand.
2. The paper includes some connection to the theoretical analysis.
1. The paper could be stronger if the authors provided some visualization or analysis of what the latent diffusers learned.
- The authors demonstrate improvement in the area where they expect their model to do well
- The paper is well written
- Elements seem to fit well together
- While I think this is not a very big deal, the paper relies mostly on existing methodologies
This paper tackles offline RL tasks that require temporal abstraction, and proposes the use of latent action representations and latent actions for planning based on a diffusion-based model. The task is formulated as a conditional diffusion problem, conditioning on returns. The paper seems to be well derived from an algorithmic viewpoint, and is one of the early works that seem to do planning based on a latent diffusion-based approach. However, in its current form the paper is quite difficult to
The paper repeatedly uses terms like latent action representation and latent action planning without carefully defining them. For self-consistency, it would be helpful to define these terms more concretely; otherwise, in its current format, the contributions of the paper can be hard to follow. I can see that the paper draws inspiration from the TAP paper (Jiang et al., 2023) and integrates a diffuser-based latent step within this framework. The paper in its current form can b
Taxonomy
Topics: Reinforcement Learning in Robotics
Methods: Diffusion
