Does Your Neural Network Extrapolate? Feature Engineering as Identifiability Bias for OOD Generalization
Leonel Aguilar, Jan Nagler, Christoph Hoelscher, Nino Antulov-Fantulin

TL;DR
This paper investigates why deep neural networks often fail to generalize out-of-distribution and shows that feature engineering and model commitments are crucial for successful OOD extrapolation.
Contribution
It introduces the concept of structural commitments in models that determine OOD generalization and demonstrates how explicit feature choices can enable reliable extrapolation.
Findings
OOD extrapolation is non-identifiable from a single training window.
Correct feature commitments can eliminate OOD error.
Feature transformations such as Fourier coordinates improve extrapolation in various scientific tasks.
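The findings above can be illustrated with a toy example. The following is a minimal sketch, not the paper's experiment: it assumes a periodic target `sin(x)`, a training window `[0, 2π]`, and an OOD window `[4π, 6π]`, and compares two hypothetical feature commitments, a degree-7 polynomial in raw `x` and the Fourier coordinates `(sin x, cos x)`. All names (`fourier_features`, `mse`, the window bounds) are assumptions made for illustration.

```python
import numpy as np


def mse(pred, target):
    """Mean squared error between two arrays."""
    return float(np.mean((pred - target) ** 2))


# Assumed toy setup: periodic target, one ID window, one distant OOD window.
x_id = np.linspace(0.0, 2.0 * np.pi, 200)
x_ood = np.linspace(4.0 * np.pi, 6.0 * np.pi, 200)
y_id, y_ood = np.sin(x_id), np.sin(x_ood)

# Commitment A: polynomial in the raw coordinate x.
poly = np.polynomial.Polynomial.fit(x_id, y_id, deg=7)
id_poly = mse(poly(x_id), y_id)
ood_poly = mse(poly(x_ood), y_ood)


# Commitment B: Fourier coordinates (sin x, cos x) with a linear readout.
def fourier_features(x):
    return np.column_stack([np.sin(x), np.cos(x)])


w, *_ = np.linalg.lstsq(fourier_features(x_id), y_id, rcond=None)
id_fourier = mse(fourier_features(x_id) @ w, y_id)
ood_fourier = mse(fourier_features(x_ood) @ w, y_ood)

print(f"polynomial: ID MSE = {id_poly:.2e}, OOD MSE = {ood_poly:.2e}")
print(f"fourier:    ID MSE = {id_fourier:.2e}, OOD MSE = {ood_fourier:.2e}")
```

In this sketch both commitments fit the training window closely, but only the Fourier commitment matches the assumed periodic structure, so only it keeps a small error outside the window. The polynomial fit diverges there because nothing in the training data constrains its behavior beyond `2π`.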
Abstract
Successful deep neural networks discover salient features of data. We show when and why they fail to learn out-of-distribution (OOD)-relevant representations from an in-distribution (ID) training window. This requires decoupling feature learning from data-generating-process (DGP) identifiability. From a single training window, OOD extrapolation is non-identifiable: infinitely many DGPs are observationally equivalent on the training data but diverge arbitrarily outside it, and no in-distribution criterion alone reliably breaks the tie. A structural commitment (the feature map, label map, and model class) dictates the assumed DGP and governs OOD generalization while leaving ID performance essentially unchanged. When architecture, pretraining, augmentation, input formats, or domain knowledge implicitly inject the missing commitment, the model…
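The non-identifiability claim in the abstract can be made concrete with a small sketch. The two data-generating processes below are hypothetical and chosen only for illustration: `dgp_a` and `dgp_b` coincide on an assumed training window `[0, 2π]`, yet `dgp_b` departs quadratically beyond it, with a `scale` parameter that makes the divergence as large as desired.

```python
import numpy as np

TRAIN_END = 2.0 * np.pi  # assumed right edge of the training window


def dgp_a(x):
    """Hypothetical DGP A: a pure sinusoid everywhere."""
    return np.sin(x)


def dgp_b(x, scale=1.0):
    """Hypothetical DGP B: equal to A up to TRAIN_END, then drifts quadratically."""
    return np.sin(x) + scale * np.maximum(0.0, x - TRAIN_END) ** 2


x_id = np.linspace(0.0, TRAIN_END, 200)
x_ood = np.linspace(TRAIN_END + 1.0, TRAIN_END + 5.0, 200)

# Largest disagreement between the two DGPs inside and outside the window.
id_gap = float(np.max(np.abs(dgp_a(x_id) - dgp_b(x_id))))
ood_gap = float(np.max(np.abs(dgp_a(x_ood) - dgp_b(x_ood))))
ood_gap_scaled = float(np.max(np.abs(dgp_a(x_ood) - dgp_b(x_ood, scale=10.0))))

print(f"ID gap = {id_gap}, OOD gap = {ood_gap:.1f}, OOD gap (scale=10) = {ood_gap_scaled:.1f}")
```

Any loss computed only on `x_id` assigns both processes the same value, so training data from this window cannot favor one over the other. Choosing between them requires a commitment made outside the data, which is the role the abstract assigns to the feature map, label map, and model class.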
