One for All: A Non-Linear Transformer can Enable Cross-Domain Generalization for In-Context Reinforcement Learning
Bowen He, Juncheng Dong, Lin Lin, Xiang Cheng

TL;DR
This paper studies how non-linear transformers can enable cross-domain generalization in in-context reinforcement learning. It interprets the transformer as performing kernel-based regression in a Reproducing Kernel Hilbert Space (RKHS) and supports this view with theoretical analysis and experiments.
Contribution
The paper introduces a kernel-based perspective on transformers in RL, connecting non-linear transformers to kernel-based temporal difference learning and regression in an RKHS, and demonstrates their ability to generalize across domains.
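A small illustration of why a kernel view of attention is natural: for a single query, softmax attention without learned projections coincides with a Nadaraya-Watson kernel smoother under the exponential kernel k(x, y) = exp(<x, y>). This is a generic, well-known identity and only a sketch; it is not the paper's construction, and the function names, shapes, and random data below are assumptions made for illustration.

```python
import numpy as np

def softmax_attention(query, keys, values):
    """Single-head softmax attention for one query (no learned projections)."""
    scores = keys @ query                      # inner products <key_i, query>
    weights = np.exp(scores - scores.max())    # numerically stable softmax
    weights /= weights.sum()
    return weights @ values

def kernel_smoother(query, keys, values):
    """Nadaraya-Watson estimate with the exponential kernel exp(<x, y>)."""
    k = np.exp(keys @ query)                   # kernel evaluations k(key_i, query)
    return (k @ values) / k.sum()

# Hypothetical context: 8 (key, value) pairs in 4 dimensions.
rng = np.random.default_rng(0)
keys = rng.normal(size=(8, 4))
values = rng.normal(size=(8,))
query = rng.normal(size=(4,))

attn_out = softmax_attention(query, keys, values)
kern_out = kernel_smoother(query, keys, values)
print(np.isclose(attn_out, kern_out))
```

The two functions differ only in the max-subtraction used for numerical stability, which cancels in the normalization.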
Findings
Transformers can be viewed as performing regression in an RKHS.
Value functions across domains can share weights within the same RKHS.
Experiments show convergence of the temporal-difference objective in multiple domains.
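To make "kernel-based temporal difference learning" concrete, here is a minimal sketch of an expected (population) kernel TD(0) update on a toy Markov chain. The value function is represented as V = K @ alpha with a Gaussian kernel centered on the states. Everything here is an assumption for illustration (the 5-state chain, the reward, the kernel bandwidth, the step size); it is not the paper's experimental setup or algorithm.

```python
import numpy as np

n_states, gamma, lr = 5, 0.9, 0.5

# Symmetric random walk on a chain; at either end the walker stays put w.p. 0.5.
P = np.zeros((n_states, n_states))
for s in range(n_states):
    P[s, max(s - 1, 0)] += 0.5
    P[s, min(s + 1, n_states - 1)] += 0.5

r = np.zeros(n_states)
r[-1] = 1.0                                # reward only in the last state
d = np.full(n_states, 1.0 / n_states)      # stationary distribution (uniform, P is symmetric)

# Gaussian kernel matrix over the states (bandwidth 0.5 is an arbitrary choice).
states = np.arange(n_states, dtype=float)
K = np.exp(-(states[:, None] - states[None, :]) ** 2 / (2 * 0.5 ** 2))

# Value function lives in the RKHS: V(s) = sum_i alpha_i * k(s_i, s).
alpha = np.zeros(n_states)
for _ in range(5000):
    V = K @ alpha
    td_error = r + gamma * (P @ V) - V     # expected TD error in each state
    alpha += lr * d * td_error             # functional step along k(s, .)

V_kernel = K @ alpha
V_true = np.linalg.solve(np.eye(n_states) - gamma * P, r)
print(np.max(np.abs(V_kernel - V_true)))
```

In this toy case the kernel matrix is full rank, so the fixed point of the update is the true value function; with sampled transitions instead of expected updates, the iterates would be noisy and a decaying step size would typically be needed.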
Abstract
A central challenge in reinforcement learning (RL) is to learn models that generalize beyond the tasks on which they are trained, a goal traditionally pursued through multi-task and meta RL. Recently, transformer architectures have emerged as a promising approach, enabling adaptation to new tasks via in-context learning without explicit parameter updates. From a functional perspective, a transformer can be viewed as a functional operator that maps a context to a task-specific function. It is thus fundamental to understand and design this operator to support stronger generalization in RL. In this work, we address the resulting question of generalization from a kernel-based perspective by establishing a connection between non-linear transformers and kernel-based temporal difference learning. By interpreting the transformer as performing regression in a Reproducing Kernel Hilbert Space…
