OptiCorNet: Optimizing Sequence-Based Context Correlation for Visual Place Recognition
Zhenyu Li, Tianyi Shang, Pengjie Xu, Ruirui Zhang, Fanchen Kong

TL;DR
OptiCorNet introduces a sequence modeling framework for visual place recognition that leverages spatial-temporal features through a differentiable, end-to-end trainable module, significantly improving robustness in dynamic environments.
Contribution
The paper proposes a novel sequence modeling approach combining a lightweight encoder with a learnable differential operator, enabling end-to-end training for improved place recognition accuracy.
Findings
Outperforms state-of-the-art methods on multiple benchmarks.
Robust to viewpoint and seasonal variations.
Learns effective sequence-level embeddings.
Abstract
Visual Place Recognition (VPR) in dynamic and perceptually aliased environments remains a fundamental challenge for long-term localization. Existing deep learning-based solutions predominantly focus on single-frame embeddings, neglecting the temporal coherence present in image sequences. This paper presents OptiCorNet, a novel sequence modeling framework that unifies spatial feature extraction and temporal differencing into a differentiable, end-to-end trainable module. Central to our approach is a lightweight 1D convolutional encoder combined with a learnable differential temporal operator, termed Differentiable Sequence Delta (DSD), which jointly captures short-term spatial context and long-range temporal transitions. The DSD module models directional differences across sequences via a fixed-weight differencing kernel, followed by an LSTM-based refinement and optional residual…
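To make the temporal differencing step concrete, here is a minimal sketch of applying a fixed-weight differencing kernel along the time axis of a sequence of per-frame descriptors. It is an illustration only, not the authors' implementation: the function name `temporal_delta`, the kernel `(-1.0, 1.0)`, and the toy descriptors are all assumptions, and the LSTM-based refinement mentioned in the abstract is deliberately left out.

```python
from typing import List, Sequence

Vector = List[float]


def temporal_delta(seq: List[Vector],
                   kernel: Sequence[float] = (-1.0, 1.0)) -> List[Vector]:
    """Slide a fixed-weight 1D kernel along the time axis of `seq`.

    `seq` is a list of T frame descriptors, each of dimension D.
    With the assumed kernel (-1, 1), output frame t is seq[t + 1] - seq[t],
    i.e. a first-order directional difference between consecutive frames.
    Returns T - len(kernel) + 1 difference vectors of dimension D.
    """
    k = len(kernel)
    if len(seq) < k:
        raise ValueError("sequence is shorter than the differencing kernel")
    dim = len(seq[0])
    out: List[Vector] = []
    for t in range(len(seq) - k + 1):
        # Weighted sum over the k frames in the window, per feature dimension.
        out.append([
            sum(kernel[j] * seq[t + j][d] for j in range(k))
            for d in range(dim)
        ])
    return out


# Toy example: three frames with 2-dimensional descriptors (hypothetical values).
frames = [[1.0, 0.0], [3.0, 1.0], [6.0, 1.0]]
deltas = temporal_delta(frames)
print(deltas)  # [[2.0, 1.0], [3.0, 0.0]]
```

Because the kernel weights are fixed, this step adds no trainable parameters, yet it is a linear operation, so gradients can still flow through it to whatever encoder produces the frame descriptors. In a deep learning framework the same operation would typically be expressed as a 1D convolution over time with frozen weights.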
