Residual Deep Gaussian Processes on Manifolds
Kacper Wyrwal, Andreas Krause, Viacheslav Borovitskiy

TL;DR
This paper introduces residual deep Gaussian process models on Riemannian manifolds for complex data supported on manifolds, outperforming shallow models in prediction, uncertainty calibration, and Bayesian optimization tasks.
Contribution
It presents a novel deep Gaussian process framework on manifolds with residual connections, capable of modeling complex data and improving performance over traditional shallow Gaussian processes.
Findings
Significant performance improvements on complex data supported on manifolds.
Enhanced uncertainty calibration and robustness to overfitting.
Better results in Bayesian optimization on manifolds.
Abstract
We propose practical deep Gaussian process models on Riemannian manifolds, similar in spirit to residual neural networks. With manifold-to-manifold hidden layers and an arbitrary last layer, they can model manifold- and scalar-valued functions, as well as vector fields. We target data inherently supported on manifolds, which is too complex for shallow Gaussian processes thereon. For example, while the latter perform well on high-altitude wind data, they struggle with the more intricate, nonstationary patterns at low altitudes. Our models significantly improve performance in these settings, enhancing prediction quality and uncertainty calibration, and remain robust to overfitting, reverting to shallow models when additional complexity is unneeded. We further showcase our models on Bayesian optimisation problems on manifolds, using stylised examples motivated by robotics, and obtain…
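The manifold-to-manifold hidden layers described in the abstract can be illustrated with a minimal sketch on the unit sphere. It assumes that each layer moves an input point along a tangent vector field via the exponential map, which is one natural reading of "residual" on a manifold; this summary does not spell out the construction. The names `project_to_tangent`, `sphere_exp`, `residual_layer`, and `toy_vector_field` are hypothetical, and the random linear field is only a stand-in for a sampled Gaussian vector field, not the authors' model.

```python
import numpy as np


def project_to_tangent(x, u):
    """Project ambient vectors u onto the tangent spaces of the unit sphere at x."""
    return u - np.sum(u * x, axis=-1, keepdims=True) * x


def sphere_exp(x, v):
    """Exponential map on the unit sphere: start at x, move along tangent vector v."""
    norm = np.linalg.norm(v, axis=-1, keepdims=True)
    safe = np.where(norm > 1e-12, norm, 1.0)  # avoid division by zero when v = 0
    return np.cos(norm) * x + np.sin(norm) * v / safe


def residual_layer(x, vector_field):
    """One manifold-to-manifold layer (assumed form): x -> exp_x(v(x))."""
    v = project_to_tangent(x, vector_field(x))
    return sphere_exp(x, v)


def toy_vector_field(weights):
    """Hypothetical stand-in for a sampled Gaussian vector field (random linear map)."""
    return lambda x: x @ weights.T


rng = np.random.default_rng(0)

# Five input points on the unit sphere in R^3.
x = rng.normal(size=(5, 3))
x /= np.linalg.norm(x, axis=-1, keepdims=True)

# Compose three residual layers; every intermediate output stays on the sphere.
h = x
for _ in range(3):
    W = 0.3 * rng.normal(size=(3, 3))
    h = residual_layer(h, toy_vector_field(W))

# A zero vector field makes the layer act as the identity.
identity_out = residual_layer(x, lambda p: np.zeros_like(p))
```

Under this assumed form, a layer whose vector field vanishes leaves its input unchanged, which is consistent with the abstract's remark that the models revert to shallow ones when additional complexity is unneeded.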
Peer Reviews
Decision: ICLR 2025 Oral
Excellent presentation: Between the excellent illustrative figures such as Figs. 1 and 2 and the compelling narrative flow, this paper is a pleasure to read, which is especially outstanding given the proclivity of some authors to lean into the formalism when discussing ML on manifolds.
Clear and consequential applications: The authors make a good case that Deep GPs can be superior to single-layer GPs, and that manifold data are important, such that getting deep GPs for manifolds is an im
Since this paper uses numerical experiments rather than theory to introduce its methodology, the experiments are crucial to get right. I have some comments on them below.
Need clearer comparison to prior manifold-to-manifold GPs: A more direct comparison to prior work with GPs on manifolds would help clarify the novelty and contribution of this article. Probably it is obvious to the authors, but it would be helpful to say exactly what the shortcomings of Mallasto and Feragen'
1. This paper proposes a simple and effective way to construct manifold-to-manifold deep GPs by modeling the residuals, which is both novel and interesting.
2. The paper is well written with a good logical flow. It provides a clear introduction of the relevant technical background and a discussion of related works, which makes it easy for readers to understand the context of this work. The description and derivation of the model is also clear and easy to follow. The beautiful figures are a nice a
It seems that, in practice, the deep GPs saturate quite quickly as depth increases, as shown in Figures 3 and 6, especially when the number of data points is small (which is the setting where GPs are typically employed). More generally, deep GPs can be difficult to train and not very scalable in practice, due to, e.g., mode collapse in variational inference. I guess residual deep GPs share these limitations, or are perhaps affected even more because of the additional components for handling manifolds. In the c
This is a very well written paper. There are quite a few concepts to explain to get the background, and the authors do this in a splendid manner. A lot of these topics are quite involved and it would have been easy to stray off the necessary path, but the authors do an excellent job of not doing this. Very concrete and to the point, well done! The derivation of the model is well done, keeping a good balance between details and intuition of the approach.
If I should play devil's advocate with this paper the authors do not do enough to highlight the novelty
Taxonomy
Topics: Gaussian Processes and Bayesian Inference
Methods: Gaussian Process
