Teacher Forcing as Generalized Bayes: Optimization Geometry Mismatch in Switching Surrogates for Chaotic Dynamics
Andre Herz, Daniel Durstewitz, Georgia Koppe

TL;DR
This paper examines the geometric mismatch between the teacher-forcing objective and the marginal likelihood when training recurrent neural networks on chaotic systems, and how that mismatch affects model training and dynamical accuracy.
Contribution
It introduces a probabilistic switching augmentation of AL-RNNs to compare the objective geometries of identity teacher forcing (ITF) and marginal likelihood, revealing how conditioning on a regime path affects curvature and model performance.
Findings
Conditioning on a single forced regime path, as ITF does, inflates curvature in the switching setting.
Marginal likelihood curvature is reduced by a missing-information correction.
Windowed evidence fine-tuning improves held-out evidence but may harm dynamical quantities.
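The curvature findings above can be illustrated on a toy problem. The sketch below is not the paper's AL-RNN setup: it assumes a hypothetical two-regime Gaussian mixture with unit variance, a single free parameter `theta` (the mean of regime 1), and a hidden regime label `z`. It uses the standard form of Louis' identity, observed information = expected complete-data information minus the posterior variance of the complete-data score (the "missing information"), and checks the result against a finite-difference second derivative of the marginal log-likelihood. All function and variable names are illustrative.

```python
import math
import random

def log_normal_pdf(x, mean):
    """Log density of N(mean, 1) evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * (x - mean) ** 2

def regime_log_joint(x, theta, pi1):
    """Return (log p(x, z=0), log p(x, z=1)); regime 0 has mean 0, regime 1 has mean theta."""
    a = math.log(1.0 - pi1) + log_normal_pdf(x, 0.0)
    b = math.log(pi1) + log_normal_pdf(x, theta)
    return a, b

def marginal_loglik(xs, theta, pi1=0.5):
    """Marginal log-likelihood with the regime summed out (log-sum-exp per sample)."""
    total = 0.0
    for x in xs:
        a, b = regime_log_joint(x, theta, pi1)
        m = max(a, b)
        total += m + math.log(math.exp(a - m) + math.exp(b - m))
    return total

def louis_information(xs, theta, pi1=0.5):
    """Observed information for theta via Louis' identity.

    Complete-data score for one sample:   1[z=1] * (x - theta)
    Complete-data information:            1[z=1]
    With r = p(z=1 | x), the posterior expectation of the information is r and
    the posterior variance of the score is (x - theta)^2 * r * (1 - r).
    """
    complete, missing = 0.0, 0.0
    for x in xs:
        a, b = regime_log_joint(x, theta, pi1)
        r = 1.0 / (1.0 + math.exp(a - b))      # posterior responsibility of regime 1
        complete += r
        missing += (x - theta) ** 2 * r * (1.0 - r)
    return complete, missing, complete - missing

# Toy data: hypothetical settings chosen only for illustration.
rng = random.Random(0)
theta_true, pi1 = 2.0, 0.5
zs = [1 if rng.random() < pi1 else 0 for _ in range(200)]
xs = [rng.gauss(theta_true if z else 0.0, 1.0) for z in zs]

complete, missing, observed = louis_information(xs, theta_true, pi1)

# Finite-difference check of the marginal curvature (negative second derivative).
h = 1e-3
numeric = -(marginal_loglik(xs, theta_true + h, pi1)
            - 2.0 * marginal_loglik(xs, theta_true, pi1)
            + marginal_loglik(xs, theta_true - h, pi1)) / h ** 2

# Curvature when one regime path is treated as known: the posterior over z is a
# point mass, so the missing-information term vanishes and only 1[z=1] remains.
fixed_path = float(sum(zs))

print(f"complete={complete:.3f} missing={missing:.3f} observed={observed:.3f}")
print(f"finite-difference={numeric:.3f} fixed-path={fixed_path:.3f}")
```

In this toy the missing-information term is non-negative by construction, so the marginal curvature can never exceed the expected complete-data curvature. Fixing a single regime path removes that correction entirely, which is a simplified analogue of the effect the findings describe, not a reproduction of the paper's experiments.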
Abstract
Identity teacher forcing (ITF) enables stable training of deterministic recurrent surrogates for chaotic dynamical systems and has been highly effective for dynamical systems reconstruction (DSR) with recurrent neural networks (RNNs), including interpretable almost-linear RNNs (AL-RNNs). However, as an intervention-based prediction loss (and thus a generalized Bayes update), teacher forcing need not match the free-running model's marginal likelihood geometry. We compare the objective-induced curvatures of ITF and marginal likelihood in a probabilistic switching augmentation of AL-RNNs, estimating ambiguity-aware observed information via Louis' identity. In the switching setting studied here, conditioning on a single forced regime path (as ITF does) inflates curvature, while marginal likelihood curvature is reduced by a missing-information correction when multiple switching explanations…
