S-TREK: Sequential Translation and Rotation Equivariant Keypoints for local feature extraction

Emanuele Santellani; Christian Sormann; Mattia Rossi; Andreas Kuhn; Friedrich Fraundorfer

arXiv:2308.14598·cs.CV·August 29, 2023

Open Access

TL;DR

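S-TREK is a local feature extractor that pairs a keypoint detector, translation and rotation equivariant by design, with a lightweight descriptor. The detector is trained in a reinforcement-learning-inspired framework whose reward is tied to keypoint repeatability; the descriptor is then trained only at the locations the trained detector selects.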

Contribution

Introduces S-TREK, which combines a translation and rotation equivariant deep keypoint detector, trained with a sequential, reinforcement-learning-inspired procedure, with a lightweight descriptor trained in a "detect, then describe" fashion.

Findings

01. Often outperforms other state-of-the-art methods in keypoint repeatability

02. Recovers higher-quality poses, especially under in-plane rotations

03. Evaluated on multiple benchmarks, which confirm the method's effectiveness
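
For readers unfamiliar with the repeatability metric mentioned above, the following is a minimal sketch of one common way to compute it, assuming the ground-truth homography between the two images is known. The function names, the 3-pixel threshold, and the mutual-nearest-neighbour rule are illustrative assumptions; the paper's exact evaluation protocol and reward may differ.

```python
import numpy as np

def warp_points(points, H):
    """Apply a 3x3 homography H to an (N, 2) array of (x, y) points."""
    homog = np.hstack([points, np.ones((len(points), 1))])
    warped = homog @ H.T
    return warped[:, :2] / warped[:, 2:3]

def repeatability(kp_a, kp_b, H_ab, eps=3.0):
    """Fraction of keypoints re-detected within eps pixels (assumed definition)."""
    if len(kp_a) == 0 or len(kp_b) == 0:
        return 0.0
    warped = warp_points(kp_a, H_ab)
    # Pairwise distances between warped A keypoints and B keypoints.
    dists = np.linalg.norm(warped[:, None, :] - kp_b[None, :, :], axis=2)
    nn_ab = dists.argmin(axis=1)  # nearest B keypoint for each A keypoint
    nn_ba = dists.argmin(axis=0)  # nearest A keypoint for each B keypoint
    idx = np.arange(len(kp_a))
    mutual = nn_ba[nn_ab] == idx
    close = dists[idx, nn_ab] <= eps
    return float(np.sum(mutual & close)) / min(len(kp_a), len(kp_b))

# Toy example: image B is image A rotated in-plane by 90 degrees about the origin.
H_rot90 = np.array([[0.0, -1.0, 0.0],
                    [1.0,  0.0, 0.0],
                    [0.0,  0.0, 1.0]])
kp_a = np.array([[1.0, 0.0], [2.0, 0.0], [0.0, 3.0]])
# Two keypoints are re-detected in B; the third detection in B is spurious.
kp_b = np.array([[0.0, 1.0], [0.0, 2.0], [100.0, 100.0]])
score = repeatability(kp_a, kp_b, H_rot90)
```

In this toy case two of the three keypoints are re-detected, so `score` is 2/3.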

Abstract

In this work we introduce S-TREK, a novel local feature extractor that combines a deep keypoint detector, which is both translation and rotation equivariant by design, with a lightweight deep descriptor extractor. We train the S-TREK keypoint detector within a framework inspired by reinforcement learning, where we leverage a sequential procedure to maximize a reward directly related to keypoint repeatability. Our descriptor network is trained following a "detect, then describe" approach, where the descriptor loss is evaluated only at those locations where keypoints have been selected by the already trained detector. Extensive experiments on multiple benchmarks confirm the effectiveness of our proposed method, with S-TREK often outperforming other state-of-the-art methods in terms of repeatability and quality of the recovered poses, especially when dealing with in-plane rotations.
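
To make the equivariance property concrete: a detector is equivariant to a transformation if transforming the input and then detecting gives the same response map as detecting and then transforming. The sketch below checks this numerically for a hypothetical hand-crafted response function, not the S-TREK network. It assumes wrap-around image borders and tests only 90-degree rotations, which are exact on a pixel grid.

```python
import numpy as np

def toy_response(image):
    """Hypothetical stand-in detector: isotropic 4-neighbour Laplacian magnitude."""
    neighbours = (
        np.roll(image, 1, axis=0) + np.roll(image, -1, axis=0)
        + np.roll(image, 1, axis=1) + np.roll(image, -1, axis=1)
    )
    return np.abs(neighbours - 4.0 * image)

def biased_response(image):
    """Counter-example: horizontal gradient only, so not rotation equivariant."""
    return np.abs(np.roll(image, -1, axis=1) - image)

def rotation_error(detector, image, k):
    """Max |detect(rotate(x)) - rotate(detect(x))| for a rotation of k * 90 degrees."""
    rotated_then_detected = detector(np.rot90(image, k))
    detected_then_rotated = np.rot90(detector(image), k)
    return float(np.max(np.abs(rotated_then_detected - detected_then_rotated)))

def translation_error(detector, image, shift):
    """Max |detect(shift(x)) - shift(detect(x))| for a cyclic (row, col) shift."""
    shifted_then_detected = detector(np.roll(image, shift, axis=(0, 1)))
    detected_then_shifted = np.roll(detector(image), shift, axis=(0, 1))
    return float(np.max(np.abs(shifted_then_detected - detected_then_shifted)))

rng = np.random.default_rng(0)
image = rng.random((32, 32))

rot_errors = [rotation_error(toy_response, image, k) for k in range(4)]
shift_error = translation_error(toy_response, image, (3, 5))
biased_error = rotation_error(biased_response, image, 1)
```

Here `rot_errors` and `shift_error` stay at floating-point noise level, while `biased_error` is clearly non-zero. A detector with a non-zero error of this kind can fire at different physical locations once the image is rotated, which is the failure mode an equivariant-by-design detector avoids.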

Taxonomy

Topics: Video Analysis and Summarization · Advanced Image and Video Retrieval Techniques · Human Pose and Action Recognition