Does Your Neural Network Extrapolate? Feature Engineering as Identifiability Bias for OOD Generalization

Leonel Aguilar; Jan Nagler; Christoph Hoelscher; Nino Antulov-Fantulin

arXiv:2605.07483·cs.LG·May 14, 2026

TL;DR

This paper investigates why deep neural networks often fail to generalize out-of-distribution and shows that feature engineering and model commitments are crucial for successful OOD extrapolation.

Contribution

It introduces structural commitments, namely the feature map, label map, and model class $(\phi, \psi, M)$, as the factor that governs OOD generalization, and demonstrates that explicit feature choices can enable reliable extrapolation.

Findings

1. OOD extrapolation is non-identifiable from a single training window.
2. Correct feature commitments can eliminate OOD error.
3. Feature transformations such as Fourier coordinates improve extrapolation in various scientific tasks.
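The effect of a feature commitment can be illustrated with a toy sketch. It is not the paper's experiment: the target function, the window boundaries, the polynomial degree, and the helper names `target`, `fourier_features`, and `rmse` are all assumptions chosen for illustration. Two models are fit on the same training window, one on the raw coordinate and one on Fourier coordinates with the period assumed known.

```python
import numpy as np

def target(x):
    # Hypothetical ground-truth DGP: a periodic signal with period 1.
    return np.sin(2 * np.pi * x)

def fourier_features(x):
    # Assumed feature map: commits in advance to a known period of 1.
    return np.column_stack([np.sin(2 * np.pi * x), np.cos(2 * np.pi * x)])

def rmse(pred, truth):
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

# In-distribution window [0, 1] and out-of-distribution window [1, 2].
x_id = np.linspace(0.0, 1.0, 200)
x_ood = np.linspace(1.0, 2.0, 200)
y_id, y_ood = target(x_id), target(x_ood)

# Model A: degree-7 polynomial on the raw coordinate (no periodicity commitment).
poly = np.polynomial.Polynomial.fit(x_id, y_id, deg=7)

# Model B: linear least squares on the Fourier features.
w, *_ = np.linalg.lstsq(fourier_features(x_id), y_id, rcond=None)

results = {
    "poly_id": rmse(poly(x_id), y_id),
    "poly_ood": rmse(poly(x_ood), y_ood),
    "fourier_id": rmse(fourier_features(x_id) @ w, y_id),
    "fourier_ood": rmse(fourier_features(x_ood) @ w, y_ood),
}
for name, value in results.items():
    print(f"{name}: {value:.3e}")
```

In this setup both models fit the training window closely, so ID error alone does not separate them; only the model whose features match the assumed structure of the signal stays accurate outside the window. The Fourier model succeeds here only because the period was committed to correctly.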

Abstract

Successful deep neural networks discover salient features of data. We show when and why they fail to learn out-of-distribution (OOD)-relevant representations from an in-distribution (ID) training window. This requires decoupling feature learning from data-generating-process (DGP) identifiability. From a single training window, OOD extrapolation is non-identifiable: infinitely many DGPs are $\varepsilon$-observationally equivalent on the training data but diverge arbitrarily outside it, and no in-distribution criterion alone reliably breaks the tie. A structural commitment, the feature map, label map, and model class $(\phi, \psi, M)$, dictates the assumed DGP and governs OOD generalization while leaving ID performance essentially unchanged. When architecture, pretraining, augmentation, input formats, or domain knowledge implicitly inject the missing commitment, the model…
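The non-identifiability claim can be made concrete with a small numerical sketch. The construction below is illustrative and not taken from the paper: it assumes that $\varepsilon$-observational equivalence means a sup-norm gap of at most $\varepsilon$ on the training window, and the names `base_dgp`, `perturbed_dgp`, and `EPS` are hypothetical. It builds a family of DGPs indexed by `k` that stay within `EPS` of a base DGP on $[0, 1]$ yet move arbitrarily far from it at a point outside the window.

```python
import numpy as np

EPS = 1e-3  # assumed tolerance for observational equivalence on the window

def base_dgp(x):
    # Hypothetical reference DGP.
    return np.sin(2 * np.pi * x)

def perturbed_dgp(x, k):
    # Alternative DGP: differs from the base by EPS * x**k, which is at most
    # EPS on [0, 1] but grows without bound in k for any x > 1.
    return base_dgp(x) + EPS * x ** k

x_train = np.linspace(0.0, 1.0, 1001)  # training window
x_far = 3.0                            # a single point outside the window

gaps = {}
for k in (2, 5, 10, 20):
    id_gap = float(np.max(np.abs(perturbed_dgp(x_train, k) - base_dgp(x_train))))
    ood_gap = float(abs(perturbed_dgp(x_far, k) - base_dgp(x_far)))
    gaps[k] = (id_gap, ood_gap)
    print(f"k={k:2d}  max ID gap={id_gap:.2e}  OOD gap at x=3: {ood_gap:.2e}")
```

Every member of the family is indistinguishable from the base DGP at tolerance `EPS` on the training window, so a criterion computed only on that window cannot prefer one over another. Choosing among them requires an assumption made outside the data, which is the role the abstract assigns to the structural commitment.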
