DIAR: Diffusion-model-guided Implicit Q-learning with Adaptive Revaluation

Jaehyun Park; Yunho Kim; Sejin Kim; Byung-Jun Lee; Sundong Kim

arXiv:2410.11338 · cs.LG · October 16, 2024

TL;DR

DIAR is a novel offline reinforcement learning framework that uses diffusion models and adaptive revaluation to improve decision-making, robustness, and generalization in long-horizon, sparse-reward tasks.

Contribution

The paper introduces DIAR, integrating diffusion models with implicit Q-learning and adaptive revaluation for enhanced offline RL performance.

Findings

01 Outperforms state-of-the-art algorithms in Maze2D, AntMaze, and Kitchen tasks.

02 Effectively handles out-of-distribution samples and long-horizon problems.

03 Improves policy robustness and generalization through diverse latent trajectories.

Abstract

We propose a novel offline reinforcement learning (offline RL) approach, introducing the Diffusion-model-guided Implicit Q-learning with Adaptive Revaluation (DIAR) framework. We address two key challenges in offline RL: out-of-distribution samples and long-horizon problems. We leverage diffusion models to learn state-action sequence distributions and incorporate value functions for more balanced and adaptive decision-making. DIAR introduces an Adaptive Revaluation mechanism that dynamically adjusts decision lengths by comparing current and future state values, enabling flexible long-term decision-making. Furthermore, we address Q-value overestimation by combining Q-network learning with a value function guided by a diffusion model. The diffusion model generates diverse latent trajectories, enhancing policy robustness and generalization. As demonstrated in tasks like Maze2D, AntMaze,…
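The Adaptive Revaluation idea described in the abstract, comparing the value of the current state with the value of a future state to decide how long to keep following a plan, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the function name `should_replan`, the toy value function `toy_value`, the goal location, and the specific rule "replan when the current state is valued higher than the predicted future state" are hypothetical stand-ins for the learned value function and the diffusion-generated trajectories used in the paper.

```python
import math
from typing import Callable, Sequence

State = Sequence[float]


def should_replan(
    value_fn: Callable[[State], float],
    current_state: State,
    predicted_future_state: State,
) -> bool:
    """Hypothetical revaluation rule (an assumption, not the paper's code).

    Returns True when the current state is valued strictly higher than the
    state the current plan is predicted to reach, i.e. when continuing the
    plan is not expected to improve the situation.
    """
    return value_fn(current_state) > value_fn(predicted_future_state)


def toy_value(state: State) -> float:
    """Toy value function: negative Euclidean distance to an assumed goal."""
    goal = (1.0, 1.0)  # hypothetical goal location
    return -math.dist(state, goal)


# A plan that moves away from the goal triggers revaluation.
moving_away = should_replan(toy_value, (0.5, 0.5), (0.0, 0.0))

# A plan that moves toward the goal is kept.
moving_closer = should_replan(toy_value, (0.0, 0.0), (0.5, 0.5))
```

In this sketch the decision length is adaptive in the sense that the agent keeps executing its current plan only while the predicted future state is valued at least as highly as the current one; how DIAR generates the replacement trajectory is outside the scope of the example.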

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Neural Networks and Applications

Methods: Diffusion · Q-Learning