Heuristic Transformer: Belief Augmented In-Context Reinforcement Learning

Oliver Dippel; Alexei Lisitsa; Bei Peng

arXiv:2511.10251·cs.LG·November 14, 2025

Open Access

TL;DR

This paper introduces Heuristic Transformer (HT), an in-context reinforcement learning method that augments the in-context dataset with a belief distribution over rewards to improve decision-making, and reports better performance than baselines in Darkroom, Miniworld, and MuJoCo environments.

Contribution

The paper proposes Heuristic Transformer (HT), which incorporates a belief distribution over rewards, learned with a variational auto-encoder, into the prompt of a transformer policy, enhancing decision accuracy and generalization.

Findings

01. HT outperforms baselines in Darkroom, Miniworld, and MuJoCo environments.

02. The belief augmentation improves decision-making effectiveness.

03. The approach bridges belief modeling and transformer decision-making.

Abstract

Transformers have demonstrated exceptional in-context learning (ICL) capabilities, enabling applications across natural language processing, computer vision, and sequential decision-making. In reinforcement learning, ICL reframes learning as a supervised problem, facilitating task adaptation without parameter updates. Building on prior work leveraging transformers for sequential decision-making, we propose Heuristic Transformer (HT), an in-context reinforcement learning (ICRL) approach that augments the in-context dataset with a belief distribution over rewards to achieve better decision-making. Using a variational auto-encoder (VAE), a low-dimensional stochastic variable is learned to represent the posterior distribution over rewards, which is incorporated alongside an in-context dataset and query states as prompt to the transformer policy. We assess the performance of HT across the…
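The prompt construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`sample_belief`, `build_prompt`), the diagonal-Gaussian form of the posterior, the token layout, and the ordering of belief, context, and query are all assumptions made for illustration. The sketch draws a low-dimensional belief latent with the standard VAE reparameterization trick and places it alongside the in-context transitions and the query state.

```python
import math
import random


def sample_belief(mu, logvar, rng):
    """Reparameterized draw z = mu + sigma * eps from a diagonal Gaussian.

    mu and logvar stand in for the output of a (hypothetical) VAE encoder.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


def build_prompt(belief, context, query_state):
    """Assemble a prompt as a list of (kind, vector) tokens.

    The ordering (belief, then transitions, then query) is an assumption;
    a real model would embed each token before feeding the transformer.
    """
    tokens = [("belief", list(belief))]
    for state, action, reward, next_state in context:
        vec = list(state) + list(action) + [reward] + list(next_state)
        tokens.append(("transition", vec))
    tokens.append(("query", list(query_state)))
    return tokens


rng = random.Random(0)
mu, logvar = [0.0, 1.0], [-2.0, -2.0]  # hypothetical encoder output
z = sample_belief(mu, logvar, rng)

# Two toy transitions: (state, action, reward, next_state).
context = [
    ((0.0, 0.0), (1.0,), 0.0, (0.0, 1.0)),
    ((0.0, 1.0), (1.0,), 1.0, (0.0, 2.0)),
]
prompt = build_prompt(z, context, (0.0, 2.0))
```

Here `prompt` holds one belief token, one token per context transition, and a final query token. In a trained system the mean and log-variance would come from an encoder conditioned on the in-context data rather than from fixed values.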

Taxonomy

Topics: Multimodal Machine Learning Applications · Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning