On Kernel Regression with Data-Dependent Kernels
James B. Simon

TL;DR
This note studies kernel regression (KR) in which the kernel may be updated after seeing the training data. It points out that a kernel chosen using the posterior of the target function is optimal in this setting, and discusses connections to the view of deep neural networks as data-dependent kernel learners.
Contribution
The note considers kernel regression with data-dependent kernels, meaning kernels that may be updated after the training data has been observed, and identifies a kernel built from the posterior of the target function as the optimal choice in that setting.
Findings
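When the kernel may be updated after seeing the training data, a kernel chosen using the posterior of the target function is optimal.
This result is analogous to the known fixed-kernel result, in which the optimal kernel equals the prior covariance of the target function.
The note discusses connections to the view of deep neural networks as data-dependent kernel learners.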
Abstract
The primary hyperparameter in kernel regression (KR) is the choice of kernel. In most theoretical studies of KR, one assumes the kernel is fixed before seeing the training data. Under this assumption, it is known that the optimal kernel is equal to the prior covariance of the target function. In this note, we consider KR in which the kernel may be updated after seeing the training data. We point out that an analogous choice of kernel using the posterior of the target function is optimal in this setting. Connections to the view of deep neural networks as data-dependent kernel learners are discussed.
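For readers who want a concrete picture of the fixed-kernel setting the abstract contrasts against, the following is a minimal NumPy sketch of kernel regression with a kernel chosen before seeing any data. It is illustrative only and not taken from the paper: the RBF kernel is an arbitrary stand-in (it is not claimed to be the prior covariance of any particular target), and the names rbf_kernel, kernel_regression, and the small ridge term are assumptions introduced here for numerical stability and readability.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    # Hypothetical fixed kernel: Gaussian (RBF) kernel between rows of a and b.
    # a has shape (n, d), b has shape (m, d); the result has shape (n, m).
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0) / length_scale**2)

def kernel_regression(x_train, y_train, x_test, kernel, ridge=1e-8):
    # Predictor: f_hat(x) = K(x, X) (K(X, X) + ridge * I)^{-1} y.
    # The ridge term is an assumption added to keep the linear solve stable.
    k_train = kernel(x_train, x_train)
    alpha = np.linalg.solve(k_train + ridge * np.eye(len(x_train)), y_train)
    return kernel(x_test, x_train) @ alpha

# Toy data: noiseless samples of sin(x) on [-3, 3].
rng = np.random.default_rng(0)
x_train = rng.uniform(-3.0, 3.0, size=(20, 1))
y_train = np.sin(x_train[:, 0])
x_test = np.linspace(-3.0, 3.0, 50)[:, None]

y_pred = kernel_regression(x_train, y_train, x_test, rbf_kernel)
train_fit = kernel_regression(x_train, y_train, x_train, rbf_kernel)
```

In this sketch the kernel argument is fixed in advance. A data-dependent variant in the sense of the abstract would instead construct the kernel from the training set before calling kernel_regression; how the posterior-based kernel is built is described in the paper itself and is not reproduced here.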
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and ELM · Gaussian Processes and Bayesian Inference
