Transformers as Implicit State Estimators: In-Context Learning in Dynamical Systems
Usman Akram, Haris Vikalo

TL;DR
This paper demonstrates that transformers can implicitly perform state estimation in dynamical systems through in-context learning, matching traditional filters in linear cases and approaching their performance in nonlinear regimes without explicit model knowledge.
Contribution
The paper frames transformers as implicit, non-parametric state estimators for dynamical systems: through in-context learning, they predict system outputs without running an explicit filtering algorithm.
Findings
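Transformers accurately predict system outputs through in-context learning, without test-time gradient updates or explicit knowledge of the system model.
Performance closely matches that of the Kalman filter in linear-Gaussian systems.
Performance approaches that of the Extended Kalman Filter (EKF) and the particle filter (PF) in nonlinear regimes.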
Abstract
Predicting the behavior of a dynamical system from noisy observations of its past outputs is a classical problem encountered across engineering and science. For linear systems with Gaussian inputs, the Kalman filter -- the best linear minimum mean-square error estimator of the state trajectory -- is optimal in the Bayesian sense. For nonlinear systems, Bayesian filtering is typically approached using suboptimal heuristics such as the Extended Kalman Filter (EKF), or numerical methods such as particle filtering (PF). In this work, we show that transformers, employed in an in-context learning (ICL) setting, can implicitly infer hidden states in order to predict the outputs of a wide family of dynamical systems, without test-time gradient updates or explicit knowledge of the system model. Specifically, when provided with a short context of past input-output pairs and, optionally, system…
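For readers unfamiliar with the linear-Gaussian baseline mentioned in the abstract, the following is a minimal sketch of a Kalman filter used for one-step-ahead output prediction. It is not the authors' code: the function name, the system matrices F, H, Q, R, the horizon T, and the random seed are all illustrative assumptions chosen for this example.

```python
import numpy as np


def kalman_one_step_predictions(F, H, Q, R, x0, P0, ys):
    """Predict each output y_t from y_0..y_{t-1} with a standard Kalman filter.

    Assumed model (illustrative, not taken from the paper):
        x_{t+1} = F x_t + w_t,  w_t ~ N(0, Q)
        y_t     = H x_t + v_t,  v_t ~ N(0, R)
    """
    x, P = x0.copy(), P0.copy()  # prior mean and covariance of the state
    preds = []
    for y in ys:
        # Predicted output, made before y is observed.
        preds.append(H @ x)
        # Measurement update: fold the new observation into the estimate.
        S = H @ P @ H.T + R               # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)    # Kalman gain
        x = x + K @ (y - H @ x)
        P = (np.eye(len(x)) - K @ H) @ P
        # Time update: propagate the estimate through the dynamics.
        x = F @ x
        P = F @ P @ F.T + Q
    return np.array(preds)


# Simulate a small stable system (all values below are hypothetical).
rng = np.random.default_rng(0)
n, m, T = 2, 1, 2000
F = np.array([[0.9, 0.1], [0.0, 0.8]])
H = np.array([[1.0, 0.0]])
Q = 0.1 * np.eye(n)
R = 0.1 * np.eye(m)

x = rng.multivariate_normal(np.zeros(n), np.eye(n))
ys = []
for _ in range(T):
    ys.append(H @ x + rng.multivariate_normal(np.zeros(m), R))
    x = F @ x + rng.multivariate_normal(np.zeros(n), Q)
ys = np.array(ys)

preds = kalman_one_step_predictions(F, H, Q, R, np.zeros(n), np.eye(n), ys)
mse_kf = np.mean((preds - ys) ** 2)   # Kalman one-step prediction error
mse_zero = np.mean(ys ** 2)           # trivial baseline: always predict zero
print(f"Kalman MSE: {mse_kf:.3f}, zero-predictor MSE: {mse_zero:.3f}")
```

Note that the filter above is handed F, H, Q, and R explicitly. In the in-context setting the abstract describes, the transformer would instead receive only the past input-output pairs (and optionally system information) and would have to produce comparable predictions without those matrices.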
Taxonomy
Topics: Neural Networks and Applications
