Understanding the Training and Generalization of Pretrained Transformer for Sequential Decision Making

Hanzhao Wang; Yu Pan; Fupeng Sun; Shang Liu; Kalyan Talluri; Guanting Chen; Xiaocheng Li

arXiv:2405.14219·cs.LG·October 3, 2024

Open Access

TL;DR

This paper studies pre-trained transformers for certain sequential decision-making problems, revealing their training as performative prediction, proposing improvements for out-of-distribution issues, and demonstrating advantages over traditional algorithms in specific scenarios.

Contribution

It introduces a training approach that incorporates transformer-generated actions, and it provides theoretical and numerical insights into the performance and exploration behavior of the pre-trained transformer.

Findings

01. Transformer training can be viewed as performative prediction.

02. Including generated actions improves training robustness.

03. Pre-trained transformers outperform structured algorithms in short horizons.

Abstract

In this paper, we consider the supervised pre-trained transformer for a class of sequential decision-making problems. The class of considered problems is a subset of the general formulation of reinforcement learning in that there is no transition probability matrix; though seemingly restrictive, the subset class of problems covers bandits, dynamic pricing, and newsvendor problems as special cases. Such a structure enables the use of optimal actions/decisions in the pre-training phase, and the usage also provides new insights for the training and generalization of the pre-trained transformer. We first note the training of the transformer model can be viewed as a performative prediction problem, and the existing methods and theories largely ignore or cannot resolve an out-of-distribution issue. We propose a natural solution that includes the transformer-generated action sequences in the…
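
The data-generation idea described in the abstract can be pictured with a minimal sketch, assuming a simulated Bernoulli bandit in which the optimal arm is known and can serve as the supervised label. This is an illustration under stated assumptions, not the paper's implementation: the names `sample_env`, `greedy_stand_in`, `generate_examples`, and `mix_prob` are hypothetical, and a simple greedy rule stands in for the transformer's own action choice.

```python
import random

def sample_env(num_arms, rng):
    """Draw a Bernoulli bandit: one success probability per arm (assumed setup)."""
    return [rng.random() for _ in range(num_arms)]

def greedy_stand_in(history, num_arms):
    """Hypothetical stand-in for the model's action: try each arm once,
    then pull the arm with the best empirical mean reward."""
    counts = [0] * num_arms
    sums = [0.0] * num_arms
    for arm, reward in history:
        counts[arm] += 1
        sums[arm] += reward
    for arm in range(num_arms):
        if counts[arm] == 0:
            return arm
    return max(range(num_arms), key=lambda a: sums[a] / counts[a])

def generate_examples(num_envs, horizon, num_arms, mix_prob, seed=0):
    """Build (history, optimal_arm) pairs for supervised training.

    With probability mix_prob a whole sequence is rolled out with the
    model's own actions (here: the greedy stand-in); otherwise actions
    come from a fixed uniform-random behavior policy.
    """
    rng = random.Random(seed)
    examples = []
    for _ in range(num_envs):
        means = sample_env(num_arms, rng)
        # The simulator knows the environment, so the optimal action is available as a label.
        optimal_arm = max(range(num_arms), key=lambda a: means[a])
        use_model_actions = rng.random() < mix_prob
        history = []
        for _ in range(horizon):
            examples.append((list(history), optimal_arm))
            if use_model_actions:
                arm = greedy_stand_in(history, num_arms)
            else:
                arm = rng.randrange(num_arms)
            reward = 1.0 if rng.random() < means[arm] else 0.0
            history.append((arm, reward))
    return examples

examples = generate_examples(num_envs=3, horizon=5, num_arms=4, mix_prob=0.5)
print(len(examples))
```

Mixing in sequences produced by the model's own actions is meant to expose training to the kind of histories the model will actually see at deployment; the mixing rule and the per-sequence choice here are assumptions made for illustration.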

Taxonomy

Topics: Neural Networks and Applications