Enhanced Transformer architecture for in-context learning of dynamical systems
Matteo Rufolo, Dario Piga, Gabriele Maroni, Marco Forgione

TL;DR
This paper improves an in-context learning approach for dynamical systems by introducing a probabilistic framework, handling non-contiguous data, and using recurrent patching, resulting in better performance and scalability.
Contribution
The paper extends the Transformer-based in-context identification architecture with probabilistic modeling, support for non-contiguous context and query windows, and recurrent patching of long context sequences.
Findings
Demonstrates improved accuracy on Wiener-Hammerstein systems
Shows enhanced scalability with long context sequences
Validates the three modifications through a numerical example
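For readers unfamiliar with the system class named above, a Wiener-Hammerstein system is a cascade of a linear dynamic block, a static nonlinearity, and a second linear dynamic block. The following is a minimal sketch of simulating one such system. The filter coefficients, the tanh nonlinearity, and the helper names `iir_filter` and `wiener_hammerstein` are illustrative assumptions, not the settings or code used in the paper.

```python
import math
import random


def iir_filter(b, a, u):
    """Apply a[0]*y[t] = sum_k b[k]*u[t-k] - sum_{k>=1} a[k]*y[t-k], with zero initial state."""
    y = []
    for t in range(len(u)):
        acc = sum(b[k] * u[t - k] for k in range(len(b)) if t - k >= 0)
        acc -= sum(a[k] * y[t - k] for k in range(1, len(a)) if t - k >= 0)
        y.append(acc / a[0])
    return y


def wiener_hammerstein(u, b1, a1, b2, a2, nonlinearity=math.tanh):
    """Linear block -> static nonlinearity -> linear block."""
    x = iir_filter(b1, a1, u)            # first linear dynamic block
    w = [nonlinearity(v) for v in x]     # static (memoryless) nonlinearity
    return iir_filter(b2, a2, w)         # second linear dynamic block


random.seed(0)
u = [random.gauss(0.0, 1.0) for _ in range(200)]
# Hypothetical stable first-order blocks (poles at 0.5 and 0.3).
y = wiener_hammerstein(u, b1=[1.0], a1=[1.0, -0.5], b2=[0.5], a2=[1.0, -0.3])
```

In a meta-modeling setting, many such systems would be drawn at random to produce synthetic input/output sequences for training; how the paper samples them is not stated in this summary.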
Abstract
Recently introduced by some of the authors, the in-context identification paradigm aims at estimating, offline and based on synthetic data, a meta-model that describes the behavior of a whole class of systems. Once trained, this meta-model is fed with an observed input/output sequence (context) generated by a real system to predict its behavior in a zero-shot learning fashion. In this paper, we enhance the original meta-modeling framework through three key innovations: by formulating the learning task within a probabilistic framework; by managing non-contiguous context and query windows; and by adopting recurrent patching to effectively handle long context sequences. The efficacy of these modifications is demonstrated through a numerical example focusing on the Wiener-Hammerstein system class, highlighting the model's enhanced performance and scalability.
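The three ingredients named in the abstract can be illustrated with a short data-handling sketch. It shows one plausible reading only: a context window and a query window separated by a gap, a context chopped into fixed-length patches (the recurrent encoder that would summarize each patch is omitted), and a Gaussian negative log-likelihood as a stand-in probabilistic loss. All function names, the gap parameter, and the Gaussian assumption are hypothetical and are not taken from the paper.

```python
import math


def split_context_query(u, y, context_len, gap, query_len):
    """Return (context, query) windows separated by `gap` skipped samples."""
    start = context_len + gap
    if start + query_len > len(u):
        raise ValueError("sequence too short for the requested windows")
    context = (u[:context_len], y[:context_len])
    query = (u[start:start + query_len], y[start:start + query_len])
    return context, query


def make_patches(seq, patch_len):
    """Chop a sequence into consecutive patches, dropping any incomplete tail."""
    n = len(seq) // patch_len
    return [seq[i * patch_len:(i + 1) * patch_len] for i in range(n)]


def gaussian_nll(y_true, mean, std):
    """Average negative log-likelihood of y_true under N(mean, std^2)."""
    total = 0.0
    for yt, m, s in zip(y_true, mean, std):
        total += 0.5 * math.log(2 * math.pi * s * s) + (yt - m) ** 2 / (2 * s * s)
    return total / len(y_true)


# Toy usage: 10 samples, 4-sample context, 2-sample gap, 3-sample query.
u = list(range(10))
y = [2 * v for v in u]
(ctx_u, ctx_y), (qry_u, qry_y) = split_context_query(u, y, context_len=4, gap=2, query_len=3)
patches = make_patches(ctx_u, patch_len=2)
```

Patching shortens the sequence the Transformer has to attend over: a context of N samples becomes roughly N / patch_len tokens, which is what makes long contexts tractable under this reading.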
Taxonomy
Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Control Systems and Identification
Methods: Activation Patching
