Multi-Objective Decision Transformers for Offline Reinforcement Learning

Abdelghani Ghanem; Philippe Ciblat; Mounir Ghogho

arXiv:2308.16379 · cs.LG · September 1, 2023

TL;DR

This paper reformulates offline reinforcement learning as a multi-objective sequence modeling task for transformers: the model predicts states and returns in addition to actions, which makes better use of the attention mechanism. It also revises the trajectory representation, and the resulting policies perform better on benchmark tasks.

Contribution

The paper introduces a multi-objective formulation and action space regions to make better use of transformer attention and to address a flaw in the trajectory representation used for offline RL.

Findings

01. Improved transformer attention utilization in offline RL.
02. Enhanced performance on D4RL locomotion benchmarks.
03. Outperforms or matches state-of-the-art methods.

Abstract

Offline Reinforcement Learning (RL) is structured to derive policies from static trajectory data without requiring real-time environment interactions. Recent studies have shown the feasibility of framing offline RL as a sequence modeling task, where the sole aim is to predict actions based on prior context using the transformer architecture. However, the limitation of this single-task learning approach is its potential to undermine the transformer model's attention mechanism, which should ideally allocate varying attention weights across different tokens in the input context for optimal prediction. To address this, we reformulate offline RL as a multi-objective optimization problem, where the prediction is extended to states and returns. We also highlight a potential flaw in the trajectory representation used for sequence modeling, which could generate inaccuracies when modeling the…
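
The idea of extending prediction from actions alone to states and returns can be illustrated with a weighted sum of per-objective errors. The sketch below is only an illustration under stated assumptions: the function names `mse` and `multi_objective_loss`, the use of mean squared error, and the weights are all hypothetical and are not taken from the paper, which may combine its objectives differently.

```python
from typing import Sequence, Tuple


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equal-length sequences."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and equal in length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def multi_objective_loss(
    action_loss: float,
    state_loss: float,
    return_loss: float,
    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0),
) -> float:
    """Weighted sum of the three prediction errors.

    Assumption: a simple linear scalarization; the weights are
    illustrative and not values reported by the authors.
    """
    w_action, w_state, w_return = weights
    return w_action * action_loss + w_state * state_loss + w_return * return_loss


# Toy predictions and targets for one trajectory step (made-up numbers).
action_loss = mse([0.5, -0.5], [0.0, 0.0])   # 0.25
state_loss = mse([1.0, 2.0], [1.0, 4.0])     # 2.0
return_loss = mse([10.0], [9.0])             # 1.0

total = multi_objective_loss(
    action_loss, state_loss, return_loss, weights=(1.0, 0.5, 0.25)
)
print(total)  # 1.5
```

With all three weights set to `(1.0, 0.0, 0.0)` the sketch reduces to the single-task setting described above, where only the action prediction error is minimized.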

Taxonomy

Topics: Reinforcement Learning in Robotics