Unveiling Markov Heads in Pretrained Language Models for Offline Reinforcement Learning
Wenhao Zhao, Qiushui Xu, Linjie Xu, Lei Song, Jinyu Wang, Chunlai Zhou, Jiang Bian

TL;DR
This paper analyzes Markov heads in pretrained language models used for offline reinforcement learning, shows that they perform well only in short-term environments, and proposes GPT2-DTMA to improve performance in long-term environments.
Contribution
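It identifies Markov heads as a crucial component of the attention heads in PLMs used for RL, proves that their extreme attention cannot be changed by re-training the embedding layer or by fine-tuning, and introduces GPT2-DTMA, which equips a pretrained DT with Mixture of Attention (MoA) to improve long-term decision-making.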
Findings
Markov heads concentrate attention on the last input token
This extreme attention cannot be changed by re-training the embedding layer or by fine-tuning
GPT2-DTMA improves performance in long-term environments
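To make the first finding concrete, here is a minimal sketch of one way to measure how much attention a head places on the most recent token. It is an illustration only, not the paper's quantitative definition of a Markov head: the function names, the toy score matrices, and the 0.9 threshold are all assumptions introduced here. Under a causal mask, the "last input token" for the query at position i is position i itself, so the sketch averages the diagonal of the attention matrix.

```python
import math

def causal_softmax_attention(scores):
    """Row-wise softmax over a square score matrix with a causal mask.

    scores[i][j] is the raw score of query position i against key position j;
    positions j > i are masked out, as in a decoder-only transformer.
    """
    weights = []
    for i, row in enumerate(scores):
        visible = row[: i + 1]
        peak = max(visible)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in visible]
        total = sum(exps)
        weights.append([e / total for e in exps] + [0.0] * (len(row) - i - 1))
    return weights

def last_token_mass(weights):
    """Mean attention weight each query places on its own (most recent) position."""
    return sum(weights[i][i] for i in range(len(weights))) / len(weights)

def flag_last_token_heads(heads, threshold=0.9):
    """Return indices of heads whose mean last-token mass exceeds `threshold`.

    `threshold` is an illustrative cutoff (an assumption), not a value from the paper.
    """
    return [h for h, scores in enumerate(heads)
            if last_token_mass(causal_softmax_attention(scores)) > threshold]

# Toy input (hypothetical): head 0 has large diagonal scores, head 1 is uniform.
T = 4
sharp_head = [[10.0 if i == j else 0.0 for j in range(T)] for i in range(T)]
flat_head = [[0.0] * T for _ in range(T)]
flagged = flag_last_token_heads([sharp_head, flat_head])
```

With these toy scores only the first head is flagged; the uniform head spreads its attention over all visible positions. In practice the score matrices would come from a real model's attention layers rather than hand-built lists.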
Abstract
Recently, incorporating knowledge from pretrained language models (PLMs) into decision transformers (DTs) has attracted significant attention in offline reinforcement learning (RL). These PLMs perform well in RL tasks, raising an intriguing question: what kind of knowledge from PLMs has been transferred to RL to achieve such good results? This work first dives into this problem by analyzing each head quantitatively and identifies the Markov head, a crucial component that exists in the attention heads of PLMs. The Markov head leads to extreme attention on the last-input token and performs well only in short-term environments. Furthermore, we prove that this extreme attention cannot be changed by re-training the embedding layer or by fine-tuning. Inspired by our analysis, we propose a general method GPT2-DTMA, which equips a pretrained DT with Mixture of Attention (MoA), to accommodate diverse attention…
Taxonomy
Topics: Neural Networks and Applications
Methods: Softmax · Attention Is All You Need
