Learning a Robust Representation via a Deep Network on Symmetric Positive Definite Manifolds
Zhi Gao, Yuwei Wu, Xingyuan Bu, and Yunde Jia

TL;DR
This paper introduces an end-to-end deep network that uses new layers to aggregate convolutional features into symmetric positive definite (SPD) matrices, improving visual classification performance.
Contribution
It proposes a deep network architecture with three new layers (nonlinear kernel aggregation, SPD matrix transformation, and vectorization) that generate and transform SPD matrix representations of convolutional features end to end.
Findings
Outperforms state-of-the-art methods in visual classification
Effectively constructs compact, discriminative SPD representations
Demonstrates robustness and improved convergence in experiments
Abstract
Recent studies have shown that aggregating convolutional features of a pre-trained Convolutional Neural Network (CNN) can achieve impressive performance for a variety of visual tasks. The symmetric positive definite (SPD) matrix has become a powerful tool due to its remarkable ability to learn an appropriate statistical representation that characterizes the underlying structure of visual features. In this paper, we propose to aggregate deep convolutional features into an SPD matrix representation through SPD generation and SPD transformation under an end-to-end deep network. To this end, several new layers are introduced in our network, including a nonlinear kernel aggregation layer, an SPD matrix transformation layer, and a vectorization layer. The nonlinear kernel aggregation layer is employed to aggregate the convolutional features into a real SPD matrix directly. The SPD matrix…
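As a rough illustration of the aggregation and vectorization steps named in the abstract, the sketch below builds an SPD matrix from a convolutional feature map and flattens it into a vector. It is not the authors' implementation. The Gaussian kernel between channel vectors, the `sigma` bandwidth, the `eps` diagonal regularizer, the upper-triangular vectorization, and both function names are assumptions made for this example; the SPD matrix transformation layer is left out because the excerpt is cut off before describing it.

```python
import numpy as np


def gaussian_kernel_aggregate(features, sigma=1.0, eps=1e-6):
    """Aggregate a (C, H, W) feature map into a C x C SPD matrix.

    Assumption: each channel is flattened into a vector and a Gaussian
    kernel is evaluated between every pair of channel vectors.
    """
    c = features.shape[0]
    x = features.reshape(c, -1)  # one row per channel
    sq_norms = np.sum(x * x, axis=1)
    # Pairwise squared Euclidean distances between channel vectors.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negatives from round-off
    k = np.exp(-sq_dists / (2.0 * sigma ** 2))
    k = 0.5 * (k + k.T)  # enforce exact symmetry
    # A Gaussian kernel matrix is positive semidefinite; the small diagonal
    # term (an assumption here) keeps it strictly positive definite.
    return k + eps * np.eye(c)


def vectorize_upper(spd):
    """Flatten the upper triangle (diagonal included) into a vector."""
    rows, cols = np.triu_indices(spd.shape[0])
    return spd[rows, cols]
```

For a feature map with C channels, the resulting vector has C(C+1)/2 entries, since the symmetric lower triangle carries no extra information.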
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection
