Beyond First-Order: Learning Riemannian Geometries for Invariant Visual Place Recognition

Jintao Cheng; Weibin Li; Zhijian He; Jin Wu; Chi Man Vong; Wei Zhang

arXiv:2602.00841·cs.CV·May 18, 2026

TL;DR

This paper introduces Riemannian Invariant Aggregation (RIA), a geometric framework for visual place recognition that models second-order scene structure on the Symmetric Positive Definite (SPD) manifold, yielding robust, invariant representations without extensive supervision.

Contribution

The paper proposes RIA, a geometric aggregation approach that explicitly models second-order scene structure on the SPD manifold to improve invariance and accuracy in visual place recognition.

Findings

01 RIA achieves zero-shot performance comparable to supervised methods.

02 RIA establishes state-of-the-art accuracy with simple fine-tuning.

03 The approach effectively preserves structural invariants under extreme environmental shifts.

Abstract

Visual Place Recognition (VPR) demands representations robust to drastic environmental and viewpoint shifts. Existing aggregation paradigms either depend on extensive supervised training or rely on first-order pooling, often struggling to preserve structural correlations under extreme shifts or incurring high adaptation costs. In this work, we propose Riemannian Invariant Aggregation (RIA), a unified geometric framework that explicitly models second-order scene structure on the Symmetric Positive Definite (SPD) manifold. By treating perturbations as tractable congruence transformations, RIA leverages geometry-aware Riemannian mappings to project covariance descriptors into a linearized Euclidean space, effectively preserving invariant structural components while suppressing noise. Extensive evaluations demonstrate that RIA achieves zero-shot performance comparable to supervised methods,…
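The abstract does not say which Riemannian mapping RIA uses, so the following is only a minimal sketch of the general idea, assuming a log-Euclidean map (the matrix logarithm) as the projection from the SPD manifold into a flat Euclidean space. The function names `covariance_descriptor` and `log_euclidean_embed`, the regularizer `eps`, and the random stand-in features are all hypothetical and are not taken from the paper.

```python
import numpy as np


def covariance_descriptor(features, eps=1e-5):
    """Second-order pooling: (N, D) local features -> (D, D) SPD covariance."""
    centered = features - features.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / max(features.shape[0] - 1, 1)
    # Assumption: a small identity term keeps the matrix strictly positive definite.
    return cov + eps * np.eye(features.shape[1])


def log_euclidean_embed(spd):
    """Matrix logarithm of an SPD matrix, flattened to a Euclidean vector."""
    eigvals, eigvecs = np.linalg.eigh(spd)
    log_spd = (eigvecs * np.log(eigvals)) @ eigvecs.T
    rows, cols = np.triu_indices(spd.shape[0])
    # Off-diagonal entries are scaled by sqrt(2) so the vector's L2 norm
    # equals the Frobenius norm of the symmetric log matrix.
    weights = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return weights * log_spd[rows, cols]


# Stand-in for local features from a backbone: 200 locations, 8 channels.
rng = np.random.default_rng(0)
local_features = rng.standard_normal((200, 8))
descriptor = log_euclidean_embed(covariance_descriptor(local_features))
print(descriptor.shape)  # (36,)
```

After this mapping, ordinary Euclidean distances between descriptors correspond to log-Euclidean distances between the underlying covariance matrices, which is one common way to make SPD descriptors usable with standard nearest-neighbor retrieval.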

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization · Multimodal Machine Learning Applications