A variational approximate posterior for the deep Wishart process
Sebastian W. Ober, Laurence Aitchison

TL;DR
This paper introduces a novel variational approximate posterior for the deep Wishart process, enabling flexible inference and demonstrating improved performance over deep Gaussian processes with similar priors.
Contribution
It develops a new distribution over positive semi-definite matrices and a variational inference scheme for the deep Wishart process, facilitating layer-dependent inference.
Findings
Inference in the DWP improves performance over inference in a DGP with a similar prior.
The paper introduces a flexible distribution over positive semi-definite matrices.
The paper develops a doubly-stochastic inducing-point inference scheme.
Abstract
Recent work introduced deep kernel processes as an entirely kernel-based alternative to neural networks (Aitchison et al. 2020). Deep kernel processes flexibly learn good top-layer representations by alternately sampling the kernel from a distribution over positive semi-definite matrices and performing nonlinear transformations. One deep kernel process, the deep Wishart process (DWP), is of particular interest because its prior can be made equivalent to deep Gaussian process (DGP) priors for kernels that can be expressed entirely in terms of Gram matrices. However, inference in DWPs has not yet been possible due to the lack of sufficiently flexible distributions over positive semi-definite matrices. Here, we give a novel approach to obtaining flexible distributions over positive semi-definite matrices by generalising the Bartlett decomposition of the Wishart probability density. We use…
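For readers unfamiliar with the Bartlett decomposition mentioned in the abstract, the sketch below shows the standard (not generalised) construction for sampling a Wishart matrix: a lower-triangular factor with chi-distributed diagonal entries and standard normal entries below the diagonal. The paper's generalisation is not described in this summary, so it is not reproduced here. The function name `sample_wishart_bartlett` and its parameters are illustrative assumptions, not names from the authors' code.

```python
import numpy as np


def sample_wishart_bartlett(scale, df, rng):
    """Draw one sample W ~ Wishart(scale, df) via the standard Bartlett decomposition.

    Hypothetical helper for illustration only; this is the textbook
    construction, not the generalised distribution proposed in the paper.
    """
    p = scale.shape[0]
    if df <= p - 1:
        raise ValueError("df must exceed p - 1 for a full-rank Wishart sample")

    # Cholesky factor of the scale matrix: scale = L @ L.T
    L = np.linalg.cholesky(scale)

    # Strictly lower-triangular part: independent standard normals.
    A = np.tril(rng.standard_normal((p, p)), k=-1)

    # Diagonal: A[j, j]**2 ~ chi-squared with (df - j) degrees of freedom,
    # where j is the zero-based row index.
    A[np.diag_indices(p)] = np.sqrt(rng.chisquare(df - np.arange(p)))

    # W = (L A)(L A)^T is symmetric positive definite by construction.
    LA = L @ A
    return LA @ LA.T


# Usage: one draw from a 3 x 3 Wishart with 5 degrees of freedom.
rng = np.random.default_rng(0)
scale = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.2],
                  [0.0, 0.2, 1.5]])
W = sample_wishart_bartlett(scale, df=5, rng=rng)
```

Because the sample is built as a product of a triangular matrix with its own transpose, positive semi-definiteness holds by construction. That property is what makes a triangular parameterisation a natural starting point for more flexible distributions over such matrices.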
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Advanced Statistical Methods and Models · Statistical Methods and Bayesian Inference
Methods: Gaussian Process
