Neural Networks on Symmetric Spaces of Noncompact Type
Xuan Son Nguyen, Shuo Yang, Aymeric Histace

TL;DR
This paper introduces a unified framework for neural networks on symmetric spaces of noncompact type, deriving new distance formulas and applying them to various tasks like image and signal classification, generation, and language inference.
Contribution
It develops a unified formulation for point-to-hyperplane distances on these spaces and derives closed-form expressions for higher-rank cases, enabling new neural network components.
Findings
Effective neural network layers for symmetric spaces of noncompact type.
Improved performance on image, EEG, and language benchmarks.
New geometric tools for neural network design on complex manifolds.
Abstract
Recent works have demonstrated promising performances of neural networks on hyperbolic spaces and symmetric positive definite (SPD) manifolds. These spaces belong to a family of Riemannian manifolds referred to as symmetric spaces of noncompact type. In this paper, we propose a novel approach for developing neural networks on such spaces. Our approach relies on a unified formulation of the distance from a point to a hyperplane on the considered spaces. We show that some existing formulations of the point-to-hyperplane distance can be recovered by our approach under specific settings. Furthermore, we derive a closed-form expression for the point-to-hyperplane distance in higher-rank symmetric spaces of noncompact type equipped with G-invariant Riemannian metrics. The derived distance then serves as a tool to design fully-connected (FC) layers and an attention mechanism for neural…
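To make the notion of a point-to-hyperplane distance concrete, here is a minimal sketch of one well-known existing formulation for the hyperbolic case: the Poincaré-ball distance used in hyperbolic neural networks (Ganea et al., 2018), with curvature fixed at -1. It is not taken from this paper, and whether it is one of the formulations the paper recovers is an assumption. The function names `mobius_add` and `dist_to_hyperplane` are hypothetical, chosen for illustration only.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def mobius_add(x, y):
    """Mobius addition x (+) y in the Poincare ball (curvature -1)."""
    xy, xx, yy = dot(x, y), dot(x, x), dot(y, y)
    denom = 1.0 + 2.0 * xy + xx * yy
    return [((1.0 + 2.0 * xy + yy) * xi + (1.0 - xx) * yi) / denom
            for xi, yi in zip(x, y)]


def dist_to_hyperplane(x, p, a):
    """Distance from x to the hyperplane through p with normal a.

    Assumed formula (Poincare ball, curvature -1):
        z = (-p) (+) x
        d = asinh( 2 |<z, a>| / ((1 - ||z||^2) ||a||) )
    """
    z = mobius_add([-pi for pi in p], x)
    numerator = 2.0 * abs(dot(z, a))
    denominator = (1.0 - dot(z, z)) * math.sqrt(dot(a, a))
    return math.asinh(numerator / denominator)


# Hyperplane through the origin with normal along the first axis.
origin, normal = [0.0, 0.0], [1.0, 0.0]
d = dist_to_hyperplane([0.5, 0.0], origin, normal)
```

As a sanity check, for a hyperplane through the origin and a point at Euclidean radius t along the normal, the formula reduces to 2·atanh(t), the hyperbolic distance from the origin, so `d` above equals ln 3. Points lying on the hyperplane get distance 0.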
Peer Reviews
Decision·ICLR 2025 Poster
- The proposed ideas of generalizing FC and attention layer to symmetric spaces are novel and seem reasonable.
- They are also general enough to incorporate both hyperbolic spaces and SPD manifolds, which are non-Euclidean spaces of interest and frequent use in ML.
- Most mathematical derivations seem rigorous (I could not follow all the details).
- The paper made a great effort to show the empirical benefits of the proposed neural network by considering diverse benchmarks.
- Understanding this paper requires a solid background in the geometry of symmetric spaces, and Section 3.2, in particular, is highly abstract and challenging to follow. This complexity may reduce the paper’s accessibility for a broader machine learning audience. Maybe incorporating the materials about decomposition equations from Appendix G.1 to G.3 in the manuscript, along with an explanation of their significance, would improve readability.
- The methods to forward propagate the FC layer and
1. The work presents a well-constructed theoretical basis by generalizing point-to-hyperplane distance formulations on symmetric spaces of noncompact type, encompassing both hyperbolic and SPD manifolds. This unified approach is a notable advancement that addresses the limitations in existing methodologies which often focus on narrower manifold types (e.g., Nguyen & Yang, 2023). The paper’s theoretical contribution strengthens its foundation, offering a comprehensive framework applicable across
Sections 4.5.1 and 4.5.2, which are the core practical contributions of this work, are difficult to follow. While the theoretical sections are clearly presented, the implementation of the proposed FC layers and attention mechanism in symmetric spaces feels briefly discussed and lacks an intuitive explanation. A more thorough discussion, with a step-by-step breakdown or additional illustrative examples, would greatly improve accessibility and clarity.
- The approach provides a novel generalization of defining neural networks in the more general symmetric spaces of noncompact type. It is very appealing that the approach can be utilized on several types of manifolds (Section 4.3).
- The paper is mostly well written and explains the technical background of the material (caveat below).
- Experimental results seem promising.
- The connection between the proposed approach and previous ones is not exactly clear, mostly in how and why they are different (see Questions).
- I think the narrative of eventually defining the FC layers in Section 4.5 (and the attention mechanism) could be improved. In particular, I feel like the connection of expressing affine functions via point-to-hyperplane distances should be further elaborated (L396-404).
Taxonomy
Topics: Morphological variations and asymmetry · Topological and Geometric Data Analysis · Statistical Mechanics and Entropy
