Skill Decision Transformer
Shyam Sudhakaran, Sebastian Risi

TL;DR
The paper introduces Skill Decision Transformer, a method that discovers diverse primitive behaviors from offline data using reward-free optimization, while remaining competitive with supervised offline RL methods.
Contribution
It presents Skill Decision Transformer, a novel approach combining hindsight relabelling and skill discovery for diverse behavior extraction in offline RL.
Findings
Skill DT can perform offline state-marginal matching.
It discovers descriptive, easily sampleable skills.
It remains competitive with supervised offline RL methods.
Abstract
Recent work has shown that Large Language Models (LLMs) can be incredibly effective for offline reinforcement learning (RL) by representing the traditional RL problem as a sequence modelling problem (Chen et al., 2021; Janner et al., 2021). However, many of these methods only optimize for high returns, and may not extract much information from a diverse dataset of trajectories. Generalized Decision Transformers (GDTs) (Furuta et al., 2021) have shown that utilizing future trajectory information, in the form of information statistics, can help extract more information from offline trajectory data. Building upon this, we propose Skill Decision Transformer (Skill DT). Skill DT draws inspiration from hindsight relabelling (Andrychowicz et al., 2017) and skill discovery methods to discover a diverse set of primitive behaviors, or skills. We show that Skill DT can not only perform offline…
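The sequence-modelling framing mentioned in the abstract can be illustrated with a minimal sketch of how a return-conditioned Decision Transformer (Chen et al., 2021) lays out a trajectory as tokens. This is not Skill DT's own pipeline: the abstract indicates Skill DT conditions on discovered skills rather than rewards, and those details are not given here. The function names `returns_to_go` and `interleave` and the token labels are hypothetical, chosen only for illustration.

```python
from typing import Any, List, Sequence, Tuple


def returns_to_go(rewards: Sequence[float]) -> List[float]:
    """Return, for each timestep, the sum of rewards from that step to the end."""
    rtg: List[float] = []
    running = 0.0
    # Accumulate from the last step backwards, then restore time order.
    for r in reversed(rewards):
        running += r
        rtg.append(running)
    rtg.reverse()
    return rtg


def interleave(
    rewards: Sequence[float],
    states: Sequence[Any],
    actions: Sequence[Any],
) -> List[Tuple[str, Any]]:
    """Flatten a trajectory into (return-to-go, state, action) token triples.

    The string labels are illustrative placeholders (an assumption), standing in
    for the learned embeddings a real model would use.
    """
    tokens: List[Tuple[str, Any]] = []
    for g, s, a in zip(returns_to_go(rewards), states, actions):
        tokens.extend([("rtg", g), ("state", s), ("action", a)])
    return tokens


if __name__ == "__main__":
    # Toy two-step trajectory with made-up values.
    for token in interleave([1.0, 2.0], ["s0", "s1"], ["a0", "a1"]):
        print(token)
```

A model trained on such sequences predicts each action from the preceding tokens, so the conditioning token steers behavior at test time. A reward-free variant would replace the return-to-go token with some other target derived from the trajectory itself.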
Taxonomy
Topics: Machine Learning and Data Classification · Reinforcement Learning in Robotics
Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Layer Normalization · Label Smoothing · Adam · Multi-Head Attention · Residual Connection · Dense Connections · Position-Wise Feed-Forward Layer
