On Interpolation Formulas Describing Neural Network Generalization
Jin Guo, Roy Y. He, Jean-Michel Morel

TL;DR
This paper extends Domingos's interpolation formula for neural networks to stochastic training, showing that the expected output of a trained model admits a kernel-machine representation with optimizer-specific weights, and links generalization to feature-space memory and kernel evolution.
Contribution
The paper introduces a stochastic gradient kernel, proves stochastic Domingos theorems, and provides a unified kernel-based interpretation of diffusion models and GANs.
Findings
Neural network outputs can be represented as kernel-weighted feature retrieval.
Training influences the feature geometry and kernel evolution over time.
Generalization depends on the alignment between test points and learned feature memory.
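The first finding can be checked numerically on a toy case. The sketch below is not the paper's construction: it assumes a linear model f_w(x) = w·x with squared loss and plain mini-batch SGD, for which the tangent kernel K(x, x_i) = x·x_i is constant and the kernel representation holds exactly along each realized training path. All names (`a`, `kernel`, `via_kernel`) and hyperparameters are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, batch = 20, 5, 4            # toy sizes (assumed for illustration)
X = rng.normal(size=(n, d))       # training inputs
y = rng.normal(size=n)            # training targets
x_test = rng.normal(size=d)       # query point

w0 = rng.normal(size=d)           # initial weights
w = w0.copy()
lr, steps = 0.05, 300
a = np.zeros(n)                   # accumulated per-sample weights

for _ in range(steps):
    idx = rng.choice(n, size=batch, replace=False)
    residual = X[idx] @ w - y[idx]        # dL/df for L = 0.5 * (f - y)^2
    a[idx] -= lr * residual / batch       # loss-dependent weight of each sampled point
    w -= lr * X[idx].T @ residual / batch # mini-batch SGD step

kernel = X @ x_test                       # K(x_test, x_i) = x_test . x_i
direct = x_test @ w                       # output of the trained model
via_kernel = x_test @ w0 + a @ kernel     # initial output + kernel-weighted sum

print(direct, via_kernel)
```

The two printed values agree up to floating-point error. For a nonlinear network the kernel changes along the trajectory, so the weights and the kernel can no longer be separated this simply; that is the regime the path-kernel formulation addresses.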
Abstract
In 2020, Domingos introduced an interpolation formula valid for "every model trained by gradient descent". He concluded that such models behave approximately as kernel machines. In this work, we extend the Domingos formula to stochastic training. We introduce a stochastic gradient kernel that extends the deterministic version via a continuous-time diffusion approximation. We prove stochastic Domingos theorems and show that the expected network output admits a kernel-machine representation with optimizer-specific weighting. This representation reveals that training samples contribute through loss-dependent weights and gradient alignment along the training trajectory. We then link the generalization error to the null space of the integral operator induced by the stochastic gradient kernel. The same path-kernel viewpoint provides a unified interpretation of diffusion models and GANs: diffusion induces…
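For orientation, the deterministic formula that the abstract takes as its starting point can be written as below. The notation is assumed here and need not match the paper's: f_w is the model, L the per-sample loss with derivative L' in its prediction argument, (x_i, y_i^*) the training pairs, and w(t) the gradient-flow trajectory. The stochastic version proved in the paper is not reproduced.

```latex
% Gradient flow on the total loss (deterministic setting, assumed notation)
\frac{dw}{dt} = -\sum_{i=1}^{n} L'\bigl(y_i^*, f_{w(t)}(x_i)\bigr)\, \nabla_w f_{w(t)}(x_i)

% Tangent (gradient) kernel at parameters w
K^{g}_{w}(x, x') = \nabla_w f_w(x) \cdot \nabla_w f_w(x')

% Chain rule along the trajectory, integrated from 0 to T
f_{w(T)}(x) = f_{w(0)}(x)
  - \int_0^T \sum_{i=1}^{n} L'\bigl(y_i^*, f_{w(t)}(x_i)\bigr)\, K^{g}_{w(t)}(x, x_i)\, dt
```

Each training sample thus enters through a loss-dependent factor L' and through the alignment of its gradient with that of the query point, accumulated over the training path.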
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks
