Delay Embedding Theory of Neural Sequence Models

Mitchell Ostrow; Adam Eisen; Ila Fiete

arXiv:2406.11993 · cs.LG · June 19, 2024 · 1 cites

TL;DR

This paper investigates whether neural sequence models, such as transformers and state-space models, can reconstruct unobserved dynamics from partial observations, connecting delay embedding theory to deep learning.

Contribution

It demonstrates that sequence models can learn delay embeddings of the underlying system, and that state-space models reconstruct unobserved information more effectively at initialization and reach lower error on dynamics tasks.

Findings

01. State-space models more effectively reconstruct unobserved dynamics at initialization.

02. Sequence layers can learn viable embeddings of the underlying system.

03. State-space models achieve lower error on dynamics tasks.

Abstract

To generate coherent responses, language models infer unobserved meaning from their input text sequence. One potential explanation for this capability arises from theories of delay embeddings in dynamical systems, which prove that unobserved variables can be recovered from the history of only a handful of observed variables. To test whether language models are effectively constructing delay embeddings, we measure the capacity of sequence models to reconstruct unobserved dynamics. We trained 1-layer transformer decoders and state-space sequence models on next-step prediction from noisy, partially observed time series data. We found that each sequence layer can learn a viable embedding of the underlying system. However, state-space models have a stronger inductive bias than transformers: in particular, they more effectively reconstruct unobserved information at initialization, leading to…
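The delay embedding idea in the abstract can be illustrated with a minimal sketch. The code below is not the paper's method or setup. It assumes the Lorenz system as a stand-in dynamical system, observes only its x coordinate, and stacks lagged copies of that single series into delay vectors. The function names (`lorenz_step`, `simulate`, `delay_embed`) and the choices `dim=3`, `lag=10`, and `dt=0.01` are hypothetical, picked only for illustration.

```python
def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Advance the Lorenz system by one fourth-order Runge-Kutta step."""
    def f(s):
        x, y, z = s
        return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = f(state)
    k2 = f(shifted(state, k1, dt / 2))
    k3 = f(shifted(state, k2, dt / 2))
    k4 = f(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(n_steps, state=(1.0, 1.0, 1.0)):
    """Return a list of n_steps (x, y, z) states, starting from `state`."""
    trajectory = [state]
    for _ in range(n_steps - 1):
        state = lorenz_step(state)
        trajectory.append(state)
    return trajectory


def delay_embed(series, dim, lag):
    """Build delay vectors [s[t], s[t - lag], ..., s[t - (dim - 1) * lag]]."""
    start = (dim - 1) * lag  # first index with a full history available
    return [
        [series[t - k * lag] for k in range(dim)]
        for t in range(start, len(series))
    ]


trajectory = simulate(2000)
observed_x = [s[0] for s in trajectory]  # partial observation: y and z are hidden
embedded = delay_embed(observed_x, dim=3, lag=10)
```

Each row of `embedded` is a point built from the history of one observed variable. Under the conditions of delay embedding theory, such points can trace out a geometry equivalent to that of the full, unobserved state, which is the property the paper asks whether sequence models learn.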

Taxonomy

Topics: Neural Networks and Applications