Fast and Stable Riemannian Metrics on SPD Manifolds via Cholesky Product Geometry
Ziheng Chen, Yue Song, Xiao-Jun Wu, Nicu Sebe

TL;DR
This paper introduces two novel Riemannian metrics for SPD matrices, derived from Cholesky factorization, offering improved computational efficiency and numerical stability for SPD neural network applications.
Contribution
We propose the Power–Cholesky and Bures–Wasserstein–Cholesky metrics, leveraging Cholesky product geometry for faster, more stable SPD matrix computations in neural networks.
Findings
The new metrics outperform existing ones in stability and efficiency.
They enable effective SPD neural network components like classifiers and residual blocks.
Experiments confirm robustness and practical advantages.
Abstract
Recent advances in Symmetric Positive Definite (SPD) matrix learning show that Riemannian metrics are fundamental to effective SPD neural networks. Motivated by this, we revisit the geometry of the Cholesky factors and uncover a simple product structure that enables convenient metric design. Building on this insight, we propose two fast and stable SPD metrics, Power–Cholesky Metric (PCM) and Bures–Wasserstein–Cholesky Metric (BWCM), derived via Cholesky decomposition. Compared with existing SPD metrics, the proposed metrics provide closed-form operators, computational efficiency, and improved numerical stability. We further apply our metrics to construct Riemannian Multinomial Logistic Regression (MLR) classifiers and residual blocks for SPD neural networks. Experiments on SPD deep learning, numerical stability analyses, and tensor interpolation demonstrate the effectiveness,…
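To make the idea of a Cholesky-derived metric concrete, here is a minimal NumPy sketch. It implements the standard Log-Cholesky distance (strictly lower-triangular parts compared in the Euclidean sense, diagonals compared after a logarithm) and, next to it, a power-deformed variant in which the logarithm is replaced by the map x ↦ (x^θ − 1)/θ. The power variant is an assumption made for illustration only: this summary does not give the paper's exact PCM definition, and the function names `log_cholesky_distance`, `power_cholesky_distance`, and `random_spd` are hypothetical.

```python
import numpy as np


def cholesky_parts(P):
    """Split the Cholesky factor of an SPD matrix into its strictly
    lower-triangular part and its (positive) diagonal."""
    L = np.linalg.cholesky(P)
    return np.tril(L, k=-1), np.diag(L)


def log_cholesky_distance(P, Q):
    """Standard Log-Cholesky distance: Euclidean on the strictly lower
    part, log-Euclidean on the diagonal of the Cholesky factors."""
    SL, dL = cholesky_parts(P)
    SK, dK = cholesky_parts(Q)
    diag_term = np.log(dL) - np.log(dK)
    return np.sqrt(np.sum((SL - SK) ** 2) + np.sum(diag_term ** 2))


def power_cholesky_distance(P, Q, theta=0.5):
    """ASSUMED power-deformed analogue (not taken from the paper):
    the log on the diagonal is replaced by x -> (x**theta - 1) / theta,
    which tends to log(x) as theta -> 0."""
    SL, dL = cholesky_parts(P)
    SK, dK = cholesky_parts(Q)
    diag_term = (dL ** theta - dK ** theta) / theta
    return np.sqrt(np.sum((SL - SK) ** 2) + np.sum(diag_term ** 2))


def random_spd(n, seed):
    """Hypothetical helper: a well-conditioned random SPD matrix."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)


if __name__ == "__main__":
    P, Q = random_spd(4, 0), random_spd(4, 1)
    print("log-Cholesky:", log_cholesky_distance(P, Q))
    print("power (theta=0.5):", power_cholesky_distance(P, Q, theta=0.5))
    print("power (theta=1e-6):", power_cholesky_distance(P, Q, theta=1e-6))
```

Under this assumed deformation, small θ recovers the Log-Cholesky distance, while larger θ avoids evaluating a logarithm on near-zero diagonal entries. Whether this matches the paper's construction should be checked against the full text.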
Peer Reviews
Decision: ICLR 2026 Poster
- Rigorous analysis of the proposed Riemannian metrics for the Cholesky manifold: from geodesics to pullback metrics for the SPD manifold.
- Stability experiments: empirical demonstration of failure probability for the derived metrics.
- I am not sure whether the composition of logarithm and exponentiation in LCE could be handled with standard numerical tricks such as logsumexp or something along those lines. Could you give the exact formula for the unstable part of LCE and explain why there is no standard workaround? I will be ok with raising the score if this part is properly addressed, because it seems to be one of the key advantages of choosing your method.
- Given that $\theta$ is a central contribution of this wo
- This paper aims at improving SPD neural networks, which have applications in many fields.
- In general, the exposition is clear and the paper is easy to follow.
- Contribution is marginal.
Replacing logarithms with power transforms to address numerical instability in LCM is well-motivated and empirically validated. The proposed metrics retain closed-form expressions. The paper is generally sound, and the usability of the two proposed metrics is critical for deep learning integration.
This paper is in general well-written, but there are still some concerns:
1. Several Riemannian and gyro-operators used in this work are only locally defined. In particular, certain expressions are well-defined only under positivity constraints such as $L^{\beta} + K^{\beta} - I \in D_{n}^{++}$.
2. The $\theta$ power is a main contribution; however, the authors do not present how to choose $\theta$.
3. Recent SPD learning works with GBWM-based classifiers are not experimentally compared.
4. The stability of the DPM and
Taxonomy
Topics: Mathematics and Applications · Mathematical Dynamics and Fractals · Geometric and Algebraic Topology
