Skill Decision Transformer

Shyam Sudhakaran; Sebastian Risi

arXiv:2301.13573 · cs.LG · February 1, 2023 · 1 citation

Open Access

TL;DR

The paper introduces Skill Decision Transformer, a method that discovers diverse primitive behaviors from offline data using reward-free optimization, improving behavior diversity and performance in offline RL tasks.

Contribution

It presents Skill Decision Transformer, a novel approach combining hindsight relabelling and skill discovery for diverse behavior extraction in offline RL.
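
To make the hindsight relabelling idea concrete, here is a minimal sketch of the generic technique, not the paper's procedure. It assumes the simplest relabelling strategy: every step of a stored trajectory is labelled with the state the trajectory actually ended in, so no reward signal is needed. The function name `relabel_with_final_state` and the tuple-based state type are hypothetical choices made for illustration; how Skill DT itself constructs and relabels skills is not described on this page.

```python
from typing import List, Sequence, Tuple

# Hypothetical state type for illustration: a tuple of floats.
State = Tuple[float, ...]


def relabel_with_final_state(
    states: Sequence[State], actions: Sequence[int]
) -> List[Tuple[State, State, int]]:
    """Relabel a trajectory in hindsight with the state it ended in.

    `states` holds s_0 .. s_T and `actions` holds a_0 .. a_{T-1}.
    Returns one (state, goal, action) triple per step, where the goal
    is the final state s_T that the trajectory actually reached.
    """
    if len(states) != len(actions) + 1:
        raise ValueError("expected exactly one more state than actions")
    goal = states[-1]  # the outcome achieved in hindsight
    return [(states[t], goal, actions[t]) for t in range(len(actions))]


# A three-state, two-action toy trajectory.
toy_states = [(0.0,), (1.0,), (2.0,)]
toy_actions = [1, 1]
relabelled = relabel_with_final_state(toy_states, toy_actions)
```

Each relabelled triple can then serve as a supervised training example of the form "in this state, aiming for this outcome, take this action", which is what lets a sequence model learn from offline data without rewards.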

Findings

01. Skill DT can perform offline state-marginal matching.

02. It discovers descriptive, easily sampleable skills.

03. It remains competitive with supervised offline RL methods.

Abstract

Recent work has shown that Large Language Models (LLMs) can be incredibly effective for offline reinforcement learning (RL) by representing the traditional RL problem as a sequence modelling problem (Chen et al., 2021; Janner et al., 2021). However, many of these methods optimize only for high returns and may not extract much information from a diverse dataset of trajectories. Generalized Decision Transformers (GDTs) (Furuta et al., 2021) have shown that utilizing future trajectory information, in the form of information statistics, can help extract more information from offline trajectory data. Building upon this, we propose Skill Decision Transformer (Skill DT). Skill DT draws inspiration from hindsight relabelling (Andrychowicz et al., 2017) and skill discovery methods to discover a diverse set of primitive behaviors, or skills. We show that Skill DT can not only perform offline…

Taxonomy

Topics: Machine Learning and Data Classification · Reinforcement Learning in Robotics

Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Layer Normalization · Label Smoothing · Adam · Multi-Head Attention · Residual Connection · Dense Connections · Position-Wise Feed-Forward Layer