Matrix Manifold Neural Networks++

Xuan Son Nguyen; Shuo Yang; Aymeric Histace

arXiv:2405.19206 · stat.ML · January 6, 2026

TL;DR

This paper extends neural network architectures to matrix manifolds like SPD and Grassmann, introducing new layers and methods that leverage algebraic structures for improved performance in tasks like action recognition.

Contribution

It introduces fully-connected and convolutional layers for SPD neural networks, along with MLR on SPSD manifolds and backpropagation methods for Grassmann manifolds.

Findings

01 Effective in human action recognition

02 Improves node classification accuracy

03 Demonstrates the viability of manifold-based neural networks

Abstract

Deep neural networks (DNNs) on Riemannian manifolds have garnered increasing interest in various applied areas. For instance, DNNs on spherical and hyperbolic manifolds have been designed to solve a wide range of computer vision and natural language processing tasks. One of the key factors that contribute to the success of these networks is that spherical and hyperbolic manifolds have the rich algebraic structures of gyrogroups and gyrovector spaces. This enables principled and effective generalizations of the most successful DNNs to these manifolds. Recently, some works have shown that many concepts in the theory of gyrogroups and gyrovector spaces can also be generalized to matrix manifolds such as Symmetric Positive Definite (SPD) and Grassmann manifolds. As a result, some building blocks for SPD and Grassmann neural networks, e.g., isometric models and multinomial logistic regression…
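To make the gyrovector-space idea in the abstract concrete, here is a minimal sketch of one binary operation on SPD matrices. It assumes the Log-Euclidean geometry, under which the operation is taken to be exp(log P + log Q); this is only one of several geometries used in this line of work and is not claimed to be the paper's implementation. The function names `spd_log`, `sym_exp`, and `le_gyroadd` are hypothetical.

```python
import numpy as np


def spd_log(P):
    """Matrix logarithm of an SPD matrix via eigendecomposition."""
    w, V = np.linalg.eigh(P)          # P = V diag(w) V^T with w > 0
    return (V * np.log(w)) @ V.T


def sym_exp(S):
    """Matrix exponential of a symmetric matrix; the result is SPD."""
    w, V = np.linalg.eigh(S)
    return (V * np.exp(w)) @ V.T


def le_gyroadd(P, Q):
    """Illustrative binary operation on SPD matrices.

    Assumption: Log-Euclidean geometry, where the operation is
    exp(log(P) + log(Q)). Other metrics lead to different formulas.
    """
    return sym_exp(spd_log(P) + spd_log(Q))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((3, 3))
    P = A @ A.T + 3.0 * np.eye(3)     # a well-conditioned SPD matrix
    R = le_gyroadd(P, np.linalg.inv(P))
    print(np.allclose(R, np.eye(3)))  # the inverse acts as the "negative"
```

Under this assumption the identity matrix plays the role of the neutral element and the matrix inverse plays the role of the negative, which is the kind of vector-space-like structure that layers such as fully-connected maps can be built on.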

Peer Reviews

Decision · ICLR 2024 poster

Reviewer 01 · Rating 3 · reject, not good enough · Confidence 5

Strengths

1. The paper is nicely motivated in the context of gyrovector representation.
2. The formulations of the convolutional and fully-connected layers are nicely derived from Nguyen, 2022a;b.

Weaknesses

1. The paper is mostly based on the formulation by Nguyen, 2022a;b. I don't want to treat this as a weakness, but as a lack of strength.
2. The experimental results are rather naive. In recent years there has been an increasing literature on manifold DNNs, so one wants to see thorough experimentation across different settings. Some comparisons are also missing, including one with Chakraborty et al. (2020).

Reviewer 02 · Rating 8 · accept, good paper · Confidence 4

Strengths

The paper is well written, easy to follow, and the mathematics appear sound. The theoretical contributions are extensive and significant, with the potential to impact research on matrix manifold neural networks in different areas. Showing how to compute the Grassmann logarithmic map in a differentiable way is another contribution. The experimental evaluation for action recognition includes existing SPD deep learning methods and shows improvements.

Weaknesses

The formulation of convolutional layers is more like a sketch of the proof than an actual definition. The experimental evaluation for node classification as shown in the main text is fairly weak: the authors did not include any baseline. At least a few Euclidean-featured GNNs should have been included, as well as hyperbolic graph neural networks (Chami et al., and newer architectures). There is actually no definition of what GyroSpd++ is in the paper beyond a description of the matrix dimensions.

Reviewer 03 · Rating 6 · marginally above the acceptance threshold · Confidence 4

Strengths

- The paper builds upon the theory of gyrovector spaces and extends it to matrix manifolds such as SPD and Grassmann manifolds.
- This paper defines a new way to build the basic blocks of neural networks, namely fully-connected and convolutional layers, for SPD matrices and especially for Grassmann manifolds, which are rarely explored in the field.
- The authors demonstrate the mathematical rigor and effectiveness of their approach.
- The authors provide an ablation study and comparison against state-of-t

Weaknesses

- The experimental evaluation is based on small, not widely used datasets (not as large as ImageNet or equivalent). This limits the overall generalizability of the proposed approach.
- The network structures are limited to FC and CNN in most cases, while other network structures are missing, such as attention, activation, etc.
- It would be helpful if the authors could clearly highlight the novel contributions and how they differ from or improve upon the existing theories discussed in ot

Taxonomy

Methods: Logistic Regression