Efficient Planning with Latent Diffusion

Wenhao Li

arXiv:2310.00311 · cs.LG · October 3, 2023

Open Access · 3 Reviews

TL;DR

This paper introduces LatentDiffuser, a framework that uses score-based diffusion models to plan in a continuous latent action space for offline reinforcement learning, improving efficiency and flexibility, especially on high-dimensional tasks.

Contribution

It proposes a unified approach for continuous latent action space learning and planning, establishing theoretical equivalence with energy-guided sampling and introducing a sequence-level exact sampling method.

Findings

01. Competitive performance on low-dimensional tasks

02. Outperforms existing methods on high-dimensional tasks

03. Efficient and flexible planning in continuous latent spaces

Abstract

Temporal abstraction and efficient planning pose significant challenges in offline reinforcement learning, mainly when dealing with domains that involve temporally extended tasks and delayed sparse rewards. Existing methods typically plan in the raw action space and can be inefficient and inflexible. Latent action spaces offer a more flexible paradigm, capturing only possible actions within the behavior policy support and decoupling the temporal structure between planning and modeling. However, current latent-action-based methods are limited to discrete spaces and require expensive planning. This paper presents a unified framework for continuous latent action space representation learning and planning by leveraging latent, score-based diffusion models. We establish the theoretical equivalence between planning in the latent action space and energy-guided sampling with a pretrained…
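To make the idea of energy-guided sampling concrete, the following is a minimal toy sketch, not the paper's method. It assumes a one-dimensional latent with a standard normal prior standing in for the pretrained diffusion model, a hypothetical quadratic energy E(z) = 0.5 (z - goal)^2 standing in for negative return, and plain Langevin dynamics in place of a diffusion sampler. The names `prior_score`, `energy_grad`, `guided_langevin`, `goal`, and `beta` are all illustrative assumptions. The target is p(z) ∝ prior(z) · exp(-beta · E(z)), whose score is the prior score minus beta times the energy gradient.

```python
import math
import random


def prior_score(z):
    """Score (gradient of the log-density) of a standard normal prior (assumed)."""
    return -z


def energy_grad(z, goal):
    """Gradient of the hypothetical toy energy E(z) = 0.5 * (z - goal)**2."""
    return z - goal


def guided_langevin(goal, beta, n_chains=2000, n_steps=300, step=0.01, seed=0):
    """Draw approximate samples from p(z) proportional to prior(z) * exp(-beta * E(z))."""
    rng = random.Random(seed)
    noise_scale = math.sqrt(2.0 * step)
    samples = []
    for _ in range(n_chains):
        z = rng.gauss(0.0, 1.0)  # start each chain from the prior
        for _ in range(n_steps):
            # Guided score: prior score minus beta times the energy gradient.
            score = prior_score(z) - beta * energy_grad(z, goal)
            z += step * score + noise_scale * rng.gauss(0.0, 1.0)
        samples.append(z)
    return samples


samples = guided_langevin(goal=2.0, beta=4.0)
sample_mean = sum(samples) / len(samples)
# Closed-form mean of the guided target in this Gaussian toy case:
# beta * goal / (1 + beta).
expected_mean = 4.0 * 2.0 / (1.0 + 4.0)
print(round(sample_mean, 2), expected_mean)
```

In this Gaussian case the guided target has a closed-form mean, so the sample mean can be checked against it. Larger values of `beta` pull samples further from the prior and toward the low-energy region, which is the trade-off guidance strength controls.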

Peer Reviews

Decision · ICLR 2024 poster

Reviewer 01 · Rating 6 (marginally above the acceptance threshold) · Confidence 2

Strengths

1. The paper is well presented. The algorithm is straightforward and easy to understand.
2. The paper includes some connection to the theoretical analysis.

Weaknesses

1. The paper could be stronger if the author provided some visualization or analysis of what the latent diffusers learned.

Reviewer 02 · Rating 8 (accept, good paper) · Confidence 4

Strengths

- The authors demonstrate improvement in the area where they expect their model to do well
- The paper is well written
- Elements seem to fit well together

Weaknesses

- While I think this is not a very big deal, the paper relies mostly on existing methodologies

Reviewer 03 · Rating 5 (marginally below the acceptance threshold) · Confidence 2

Strengths

This paper tackles offline RL tasks that require temporal abstraction, and proposes the use of latent action representations and latent actions for planning based on a diffusion based model. The task is formulated as a conditional diffusion problem, conditioning on returns. The paper seems to be well derived from an algorithmic viewpoint, and is one of the early works that seem to do planning based on a latent diffusion based approach. However, in its current form the paper is quite difficult to

Weaknesses

The paper repeatedly uses terms like the latent action representation and latent action planning, without a carefully derived definition of it. For self consistency, it would be helpful to define these terms more concretely; otherwise, in its current format, the contributions of the paper can be hard to follow. I can see that the paper draws inspiration from the TAP paper (Jiang et al., 2023) and integrates a diffuser based latent step within this framework. The paper in its current form can b

Taxonomy

Topics: Reinforcement Learning in Robotics

Methods: Diffusion