Understanding the Training and Generalization of Pretrained Transformer for Sequential Decision Making
Hanzhao Wang, Yu Pan, Fupeng Sun, Shang Liu, Kalyan Talluri, Guanting Chen, Xiaocheng Li

TL;DR
This paper studies supervised pre-trained transformers for a class of sequential decision-making problems that have no transition probability matrix. It shows that their training can be viewed as performative prediction, proposes a remedy for the resulting out-of-distribution issue, and reports advantages over traditional algorithms in specific scenarios.
Contribution
It introduces a new training approach that incorporates transformer-generated actions, and it provides theoretical and numerical insights into the pre-trained transformer's performance and exploration behavior.
Findings
Transformer training can be viewed as performative prediction.
Including generated actions improves training robustness.
Pre-trained transformers outperform structured algorithms in short horizons.
Abstract
In this paper, we consider the supervised pre-trained transformer for a class of sequential decision-making problems. The class of considered problems is a subset of the general formulation of reinforcement learning in that there is no transition probability matrix; though seemingly restrictive, the subset class of problems covers bandits, dynamic pricing, and newsvendor problems as special cases. Such a structure enables the use of optimal actions/decisions in the pre-training phase, and the usage also provides new insights for the training and generalization of the pre-trained transformer. We first note the training of the transformer model can be viewed as a performative prediction problem, and the existing methods and theories largely ignore or cannot resolve an out-of-distribution issue. We propose a natural solution that includes the transformer-generated action sequences in the…
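The data-generation idea in the abstract can be illustrated with a toy example. The following is a minimal sketch, not the authors' implementation: it assumes a Bernoulli multi-armed bandit, and it assumes that "including transformer-generated action sequences" means rolling out the current model's own actions when building training histories. All names (`collect_pretraining_pairs`, `mix_prob`, `greedy_policy`, `uniform_policy`) are hypothetical, and `greedy_policy` is only a stand-in for a trained transformer. Because the environment is simulated, the optimal action is known and serves as the supervised label for every history prefix.

```python
import random


def sample_environment(num_arms, rng):
    """Draw a hypothetical Bernoulli bandit: one success probability per arm."""
    return [rng.random() for _ in range(num_arms)]


def optimal_action(means):
    """With the environment known, the optimal action is the arm with the highest mean."""
    return max(range(len(means)), key=lambda a: means[a])


def greedy_policy(history, num_arms, rng):
    """Stand-in for a trained model (assumption): try each arm once, then exploit."""
    pulls = [0] * num_arms
    wins = [0.0] * num_arms
    for action, reward in history:
        pulls[action] += 1
        wins[action] += reward
    for arm in range(num_arms):
        if pulls[arm] == 0:
            return arm
    return max(range(num_arms), key=lambda a: wins[a] / pulls[a])


def uniform_policy(history, num_arms, rng):
    """Fixed behavior policy (assumption): pick an arm uniformly at random."""
    return rng.randrange(num_arms)


def collect_pretraining_pairs(model_policy, num_envs, horizon, num_arms,
                              mix_prob, seed=0):
    """Build (history prefix, optimal action) pairs for supervised pre-training.

    With probability mix_prob an episode's actions come from model_policy,
    otherwise from uniform_policy. The training distribution therefore
    depends on the model itself, which is the performative aspect.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(num_envs):
        means = sample_environment(num_arms, rng)
        target = optimal_action(means)
        policy = model_policy if rng.random() < mix_prob else uniform_policy
        history = []
        for _ in range(horizon):
            # The label is the optimal action, not the action actually taken.
            pairs.append((list(history), target))
            action = policy(history, num_arms, rng)
            reward = 1.0 if rng.random() < means[action] else 0.0
            history.append((action, reward))
    return pairs
```

Calling `collect_pretraining_pairs(greedy_policy, num_envs=100, horizon=20, num_arms=5, mix_prob=0.5)` yields 2,000 labeled prefixes. Setting `mix_prob=0` gives histories from the fixed behavior policy only, which is the setting where a deployed model may encounter histories unlike those it was trained on.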
Taxonomy
Topics: Neural Networks and Applications
