On Feature Collapse and Deep Kernel Learning for Single Forward Pass Uncertainty
Joost van Amersfoort, Lewis Smith, Andrew Jesson, Oscar Key, Yarin Gal

TL;DR
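This paper identifies why Deep Kernel Learning (DKL) gives unreliable uncertainty estimates: with no constraints, its objective maps "far-away" data points to the same features as training-set points. It proposes a bi-Lipschitz constraint on the feature extractor to address this. The resulting model, DUE, outperforms previous DKL methods in uncertainty estimation while maintaining the speed and accuracy of a neural network.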
Contribution
The paper introduces a bi-Lipschitz constraint on DKL's feature extractor to improve uncertainty estimation, and proposes DUE, a model that combines neural network speed with improved uncertainty quality.
Findings
DUE outperforms previous DKL methods in uncertainty estimation.
Constraining DKL's feature extractor improves the feature space for Gaussian processes.
The proposed method maintains neural network speed and accuracy.
Abstract
Inducing point Gaussian process approximations are often considered a gold standard in uncertainty estimation since they retain many of the properties of the exact GP and scale to large datasets. A major drawback is that they have difficulty scaling to high dimensional inputs. Deep Kernel Learning (DKL) promises a solution: a deep feature extractor transforms the inputs over which an inducing point Gaussian process is defined. However, DKL has been shown to provide unreliable uncertainty estimates in practice. We study why, and show that with no constraints, the DKL objective pushes "far-away" data points to be mapped to the same features as those of training-set points. With this insight we propose to constrain DKL's feature extractor to approximately preserve distances through a bi-Lipschitz constraint, resulting in a feature space favorable to DKL. We obtain a model, DUE, which…
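As a rough illustration of what "approximately preserve distances" means, the sketch below checks the bi-Lipschitz property numerically: a map f is bi-Lipschitz if the ratio of feature-space distance to input-space distance stays between a positive lower bound and a finite upper bound for every pair of points. The construction used here (a residual block whose branch is spectrally normalized and scaled by c < 1) is an assumption chosen for illustration, since the text above only states that a bi-Lipschitz constraint is used. The names `spectral_normalize`, `residual_block`, and `distance_ratios` are hypothetical helpers, not part of any DUE codebase.

```python
import numpy as np

def spectral_normalize(W):
    # Divide by the largest singular value so the linear map is 1-Lipschitz.
    return W / np.linalg.norm(W, ord=2)

def residual_block(x, W, c=0.5):
    # f(x) = x + c * tanh(W x). tanh is 1-Lipschitz and ||W||_2 = 1,
    # so the residual branch is c-Lipschitz with c < 1 (illustrative assumption).
    return x + c * np.tanh(x @ W.T)

def distance_ratios(f, X):
    # Ratio ||f(x_i) - f(x_j)|| / ||x_i - x_j|| over all pairs i < j.
    i, j = np.triu_indices(len(X), k=1)
    d_in = np.linalg.norm(X[i] - X[j], axis=1)
    F = f(X)
    d_out = np.linalg.norm(F[i] - F[j], axis=1)
    return d_out / d_in

rng = np.random.default_rng(0)
d, c = 8, 0.5
W = spectral_normalize(rng.normal(size=(d, d)))
X = rng.normal(size=(200, d)) * 3.0

# Constrained map: every ratio lies in [1 - c, 1 + c].
ratios = distance_ratios(lambda x: residual_block(x, W, c), X)

# Hypothetical fully collapsed extractor: all inputs map to one feature vector,
# so every ratio is 0 and far-away points become indistinguishable.
collapsed = distance_ratios(lambda x: np.zeros_like(x), X)

print(f"constrained ratios in [{ratios.min():.3f}, {ratios.max():.3f}]")
print(f"collapsed ratios max = {collapsed.max():.3f}")
```

The lower bound is the part that matters for feature collapse: an unconstrained extractor is free to drive these ratios to zero, whereas a lower bound of 1 - c > 0 keeps distinct inputs distinct in feature space.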
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Advanced Multi-Objective Optimization Algorithms · Machine Learning and Data Classification
Methods: Deep Kernel Learning · High-Order Consensuses · Softmax
