TL;DR
This paper introduces a scalable, kernel-independent hierarchical matrix method for Gaussian process maximum likelihood estimation. It yields stochastic estimators of the gradient and Hessian of the approximated likelihood, and of the expected Fisher information, that are computable in quasilinear time for a large range of models.
Contribution
It proposes a hierarchical matrix approximation of covariance matrices whose derivatives can be computed exactly. This makes gradient, Hessian, and Fisher information computations efficient and improves the scalability of Gaussian process maximum likelihood estimation.
Findings
Achieves quasilinear $O(n \, \log^2 n)$ complexity for key computations.
Produces MLEs and confidence intervals comparable to those obtained with exact methods.
Demonstrates the method's scalability and discusses practical implementation details.
Abstract
We present a kernel-independent method that applies hierarchical matrices to the problem of maximum likelihood estimation for Gaussian processes. The proposed approximation provides natural and scalable stochastic estimators for its gradient and Hessian, as well as the expected Fisher information matrix, that are computable in quasilinear complexity for a large range of models. To accomplish this, we (i) choose a specific hierarchical approximation for covariance matrices that enables the computation of their exact derivatives and (ii) use a stabilized form of the Hutchinson stochastic trace estimator. Since both the observed and expected information matrices can be computed in quasilinear complexity, covariance matrices for MLEs can also be estimated efficiently. After discussing the associated mathematics, we demonstrate the scalability of the method, discuss details…
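The trace estimator named in the abstract can be illustrated with a small dense example. The sketch below is not the paper's implementation: it uses plain NumPy on dense matrices (no hierarchical structure), and it assumes that "stabilized" can be modeled by one common symmetrization, namely estimating $\mathrm{tr}(\Sigma^{-1}\,\partial_\theta \Sigma)$ through the symmetric operator $L^{-1}(\partial_\theta \Sigma)L^{-T}$ with $\Sigma = LL^T$. The function names, the exponential kernel, and the parameter values are all hypothetical choices made for illustration.

```python
import numpy as np


def hutchinson_trace(matvec, n, num_probes=500, seed=0):
    """Estimate tr(A) as the average of u^T A u over random sign vectors u.

    `matvec` applies A to a vector; A itself is never formed.
    """
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(num_probes):
        u = rng.choice([-1.0, 1.0], size=n)  # Rademacher probe vector
        total += u @ matvec(u)
    return total / num_probes


def symmetrized_operator(L, dSigma):
    """Return a matvec for L^{-1} dSigma L^{-T}.

    With Sigma = L L^T this operator is symmetric and has the same trace
    as Sigma^{-1} dSigma. Assumption: this is only one possible
    symmetrization, used here for illustration.
    """
    def matvec(u):
        v = np.linalg.solve(L.T, u)            # v = L^{-T} u
        return np.linalg.solve(L, dSigma @ v)  # L^{-1} dSigma v
    return matvec


def exponential_kernel(x, theta):
    """Dense exponential covariance and its derivative with respect to theta."""
    dist = np.abs(x[:, None] - x[None, :])
    Sigma = np.exp(-dist / theta)
    dSigma = Sigma * dist / theta**2
    return Sigma, dSigma


# Illustrative setup: 200 points on a regular grid, range parameter 0.3.
x = np.linspace(0.0, 1.0, 200)
Sigma, dSigma = exponential_kernel(x, theta=0.3)
L = np.linalg.cholesky(Sigma)

estimate = hutchinson_trace(symmetrized_operator(L, dSigma), n=len(x))
exact = np.trace(np.linalg.solve(Sigma, dSigma))
print(f"stochastic estimate: {estimate:.3f}, exact trace: {exact:.3f}")
```

In a quasilinear setting the dense solves above would be replaced by fast hierarchical-matrix solves, so that each probe costs far less than the $O(n^3)$ needed for the exact trace; the structure of the estimator itself is unchanged.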
