In-Context Learning of Linear Systems: Generalization Theory and Applications to Operator Learning
Frank Cole, Yulong Lu, Wuzhe Xu, Tianhao Zhang

TL;DR
This paper provides theoretical analysis and bounds for in-context learning of linear systems using transformers, highlighting the importance of task diversity for generalization under distribution shifts and demonstrating applications to operator learning for PDEs.
Contribution
The paper derives neural scaling laws for in-domain generalization, introduces a novel notion of task diversity for out-of-domain shifts, and applies these ideas to in-context operator learning for PDEs.
Findings
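Neural scaling laws bound the generalization error in terms of the number of training tasks and the sample sizes used in training and inference.
Task diversity in the training distribution is a necessary and sufficient condition for pre-trained transformers to generalize under task distribution shifts.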
Numerical experiments validate the theoretical bounds.
Abstract
We study theoretical guarantees for solving linear systems in-context using a linear transformer architecture. For in-domain generalization, we provide neural scaling laws that bound the generalization error in terms of the number of tasks and sizes of samples used in training and inference. For out-of-domain generalization, we find that the behavior of trained transformers under task distribution shifts depends crucially on the distribution of the tasks seen during training. We introduce a novel notion of task diversity and show that it defines a necessary and sufficient condition for pre-trained transformers to generalize under task distribution shifts. We also explore applications of learning linear systems in-context, such as in-context operator learning for PDEs. Finally, we provide numerical experiments to validate the established theory.
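To make the setting concrete, here is a minimal sketch of solving a linear system in-context with a linear-attention style estimator. It is an illustration under stated assumptions, not the paper's trained model: the context consists of pairs (f_i, u_i) with u_i = A^{-1} f_i, the inputs f_i are assumed to be isotropic Gaussian, and the attention weights are hand-set to the identity rather than learned. The helper names `make_task`, `linear_attention_predict`, and `relative_error` are hypothetical.

```python
import numpy as np


def make_task(d, rng):
    """Draw a random symmetric positive-definite matrix A (a hypothetical task)."""
    m = rng.standard_normal((d, d))
    return m @ m.T / d + np.eye(d)


def linear_attention_predict(F, U, f_query):
    """Linear-attention style estimate of A^{-1} f_query from context pairs.

    F: (n, d) array of context right-hand sides f_i.
    U: (n, d) array of context solutions u_i = A^{-1} f_i.
    Assumption: E[f f^T] = I, so (1/n) sum_i u_i f_i^T approximates A^{-1}.
    """
    n = F.shape[0]
    M = U.T @ F / n  # (d, d) empirical estimate of A^{-1}
    return M @ f_query


def relative_error(n, d=5, seed=0):
    """Relative prediction error for one task with a context of length n."""
    rng = np.random.default_rng(seed)
    A = make_task(d, rng)
    F = rng.standard_normal((n, d))      # context inputs f_i ~ N(0, I)
    U = np.linalg.solve(A, F.T).T        # context solutions u_i = A^{-1} f_i
    f_query = rng.standard_normal(d)
    u_true = np.linalg.solve(A, f_query)
    u_hat = linear_attention_predict(F, U, f_query)
    return np.linalg.norm(u_hat - u_true) / np.linalg.norm(u_true)


if __name__ == "__main__":
    for n in (20, 200, 2000, 20000):
        print(n, relative_error(n))
```

In this toy version the error shrinks as the context length grows, because the empirical second moment of the inputs concentrates around the identity. The learned transformer studied in the paper must additionally cope with a finite number of training tasks and with shifts in the task distribution, which this sketch does not model.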
Taxonomy
Topics: Fault Detection and Control Systems
