TL;DR
This paper introduces a conditional Deep Gaussian Process model that combines deep hierarchical structure with Bayesian hyperdata learning, improving expressiveness and reducing overfitting compared to traditional deep kernel learning.
Contribution
It proposes a novel conditional DGP with hyperdata support, optimized via empirical Bayes, and demonstrates its equivalence to deep kernel learning in dense hyperdata limits.
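One intuition for the dense hyperdata limit, offered here as an assumption rather than the paper's own argument: when a GP is conditioned on more and more hyperdata, its conditional variance shrinks, so the intermediate layer behaves increasingly like a deterministic feature map, which is the setting of deep kernel learning. The sketch below checks only the variance-shrinking part with a plain RBF kernel. The kernel choice, input range, and the function names `rbf` and `max_conditional_variance` are hypothetical and not taken from the paper.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    """Unit-variance squared-exponential kernel between 1-D arrays a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def max_conditional_variance(n_hyper, jitter=1e-6):
    """Largest conditional variance on [-2, 2] given n_hyper evenly spaced hyperdata inputs.

    The conditional variance of a GP does not depend on the hyperdata outputs,
    so only the hyperdata input locations z are needed here.
    """
    z = np.linspace(-2.0, 2.0, n_hyper)   # hyperdata inputs (assumed evenly spaced)
    x = np.linspace(-2.0, 2.0, 50)        # evaluation points
    Kzz = rbf(z, z) + jitter * np.eye(n_hyper)
    Kxz = rbf(x, z)
    # diag(Kxx - Kxz Kzz^{-1} Kzx); the prior variance of this kernel is 1
    var = 1.0 - np.sum(Kxz * np.linalg.solve(Kzz, Kxz.T).T, axis=1)
    return float(var.max())

for n in (3, 6, 15):
    print(n, max_conditional_variance(n))
```

The printed maximum variance falls as the number of hyperdata grows, which is consistent with the intermediate layer becoming nearly deterministic in that limit.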
Findings
Preliminary results show enhanced expressiveness with depth.
Hyperdata learning improves model flexibility.
The model offers a more Bayesian alternative to deep kernel learning.
Abstract
It is desirable to combine the expressive power of deep learning with Gaussian Processes (GPs) in one expressive Bayesian learning model. Deep kernel learning showed success in adopting a deep network for feature extraction followed by a GP used as the function model. Recently, it was suggested that, despite training with marginal likelihood, the deterministic nature of the feature extractor might lead to overfitting, while replacing it with a Bayesian network seemed to cure it. Here, we propose the conditional Deep Gaussian Process (DGP), in which the intermediate GPs in the hierarchical composition are supported by the hyperdata and the exposed GP remains zero mean. Motivated by the inducing points in sparse GPs, the hyperdata also play the role of function supports, but are hyperparameters rather than random variables. We follow our previous moment matching approach to approximate the marginal prior…
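The structure described in the abstract can be illustrated with a two-layer sketch: an intermediate GP conditioned on hyperdata (inputs `z`, outputs `u`, treated as fixed hyperparameters) feeds a zero-mean exposed GP. This is a minimal sampling sketch under stated assumptions, not the paper's method: it draws samples by direct composition instead of using the moment matching approximation, it does not optimize the hyperdata, and the RBF kernel and all function names are hypothetical.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between 1-D arrays a and b."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def conditional_gp(x, z, u, jitter=1e-6):
    """Mean and covariance at x of a zero-mean GP conditioned on hyperdata (z, u)."""
    Kzz = rbf(z, z) + jitter * np.eye(len(z))
    Kxz = rbf(x, z)
    A = np.linalg.solve(Kzz, Kxz.T).T      # Kxz Kzz^{-1}
    mean = A @ u
    cov = rbf(x, x) - A @ Kxz.T
    return mean, cov

def sample_conditional_dgp(x, z, u, rng, jitter=1e-6):
    """Draw one sample (h, f) from a two-layer conditional DGP at inputs x."""
    # Intermediate layer: GP supported by the hyperdata, so its mean is nonzero.
    mean, cov = conditional_gp(x, z, u, jitter)
    cov = 0.5 * (cov + cov.T) + jitter * np.eye(len(x))
    h = rng.multivariate_normal(mean, cov)
    # Exposed layer: zero-mean GP whose kernel acts on the intermediate outputs h.
    K_out = rbf(h, h) + jitter * np.eye(len(x))
    f = rng.multivariate_normal(np.zeros(len(x)), K_out)
    return h, f

rng = np.random.default_rng(0)
x = np.linspace(-2.0, 2.0, 20)
z = np.array([-1.0, 0.0, 1.0])   # hyperdata inputs (illustrative values)
u = np.array([0.5, -0.3, 0.8])   # hyperdata outputs (illustrative values)
h, f = sample_conditional_dgp(x, z, u, rng)
print(h.shape, f.shape)
```

In an empirical Bayes treatment, `z` and `u` would be tuned together with the kernel hyperparameters by maximizing an approximate marginal likelihood; that step is omitted here.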
Taxonomy
Methods: Variational Inference · Greedy Policy Search · Gaussian Process
