IQL-TD-MPC: Implicit Q-Learning for Hierarchical Model Predictive Control

Rohan Chitnis; Yingchen Xu; Bobak Hashemi; Lucas Lehnert; Urun Dogan; Zheqing Zhu; Olivier Delalleau

arXiv:2306.00867 · cs.LG · May 17, 2024 · 1 citation

TL;DR

This paper introduces IQL-TD-MPC, a hierarchical offline model-based RL method that improves performance on long-horizon, sparse-reward tasks by planning with Implicit Q-Learning (IQL) and intent embeddings.

Contribution

The paper extends TD-MPC with Implicit Q-Learning (IQL) to improve long-term planning, and proposes a hierarchical framework in which IQL-TD-MPC serves as the Manager and any off-the-shelf offline RL algorithm serves as the Worker.
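For readers unfamiliar with IQL, the sketch below shows the expectile regression loss that standard IQL uses to fit a state-value function from dataset actions only. It is a minimal illustration, not the paper's implementation: the function name `expectile_loss`, the value `tau=0.7`, and the toy numbers are assumptions, and how this loss is combined with TD-MPC's learned model is not described in this summary.

```python
import numpy as np

def expectile_loss(q_values, v_values, tau=0.7):
    """Asymmetric squared error L_tau(u) = |tau - 1(u < 0)| * u^2 with u = Q - V.

    With tau > 0.5, under-estimates of Q (u > 0) are penalized more than
    over-estimates, so V is pushed toward an upper expectile of Q.
    """
    diff = np.asarray(q_values, dtype=float) - np.asarray(v_values, dtype=float)
    # Weight is (1 - tau) where V exceeds Q, and tau elsewhere.
    weight = np.where(diff < 0, 1.0 - tau, tau)
    return float(np.mean(weight * diff ** 2))

# Toy batch (made-up numbers): Q(s, a) for dataset actions and current V(s).
q = np.array([1.0, 2.0, 0.5])
v = np.array([1.5, 1.0, 0.5])

loss = expectile_loss(q, v, tau=0.7)
# With tau = 0.5 the loss reduces to half the ordinary mean squared error.
symmetric = expectile_loss(q, v, tau=0.5)
```

Because the loss only evaluates Q at actions that appear in the dataset, it avoids querying out-of-distribution actions, which is the usual motivation for IQL in offline settings.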

Findings

01. Significant performance improvements on D4RL benchmarks.

02. Hierarchical approach with intent embeddings boosts offline RL algorithms.

03. Achieves high scores where baseline methods fail.

Abstract

Model-based reinforcement learning (RL) has shown great promise due to its sample efficiency, but still struggles with long-horizon sparse-reward tasks, especially in offline settings where the agent learns from a fixed dataset. We hypothesize that model-based RL agents struggle in these environments due to a lack of long-term planning capabilities, and that planning in a temporally abstract model of the environment can alleviate this issue. In this paper, we make two key contributions: 1) we introduce an offline model-based RL algorithm, IQL-TD-MPC, that extends the state-of-the-art Temporal Difference Learning for Model Predictive Control (TD-MPC) with Implicit Q-Learning (IQL); 2) we propose to use IQL-TD-MPC as a Manager in a hierarchical setting with any off-the-shelf offline RL algorithm as a Worker. More specifically, we pre-train a temporally abstract IQL-TD-MPC Manager to…

Taxonomy

Topics: Advanced Causal Inference Techniques

Methods: Q-Learning