S-TREK: Sequential Translation and Rotation Equivariant Keypoints for local feature extraction
Emanuele Santellani, Christian Sormann, Mattia Rossi, Andreas Kuhn, Friedrich Fraundorfer

TL;DR
S-TREK is a new local feature extraction method combining a translation and rotation equivariant keypoint detector with a lightweight descriptor, trained via reinforcement learning to improve repeatability and pose recovery.
Contribution
Introduces S-TREK, a novel equivariant keypoint detector and descriptor framework trained with reinforcement learning for enhanced local feature extraction.
Findings
Often outperforms other state-of-the-art methods in keypoint repeatability
Recovers higher-quality poses, especially under in-plane rotations
Shown to be effective in experiments on multiple benchmarks
Abstract
In this work we introduce S-TREK, a novel local feature extractor that combines a deep keypoint detector, which is both translation and rotation equivariant by design, with a lightweight deep descriptor extractor. We train the S-TREK keypoint detector within a framework inspired by reinforcement learning, where we leverage a sequential procedure to maximize a reward directly related to keypoint repeatability. Our descriptor network is trained following a "detect, then describe" approach, where the descriptor loss is evaluated only at those locations where keypoints have been selected by the already trained detector. Extensive experiments on multiple benchmarks confirm the effectiveness of our proposed method, with S-TREK often outperforming other state-of-the-art methods in terms of repeatability and quality of the recovered poses, especially when dealing with in-plane rotations.
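The abstract ties the detector's reward to keypoint repeatability without defining it here. As an illustration only, the sketch below computes one common, simplified form of repeatability between two images related by a known homography: the fraction of keypoints from the first image that, once warped into the second, land within a few pixels of a keypoint detected there. The function names, the one-directional definition, and the 3-pixel threshold are assumptions made for this example; they are not taken from the paper and need not match its reward.

```python
import numpy as np


def warp_points(points, H):
    """Apply a 3x3 homography H to an (N, 2) array of (x, y) points."""
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    warped = homogeneous @ H.T
    return warped[:, :2] / warped[:, 2:3]


def repeatability(kpts_a, kpts_b, H_ab, eps=3.0):
    """Fraction of image-A keypoints re-detected in image B.

    Simplified, one-directional definition (an assumption of this sketch):
    a keypoint counts as repeated if, after warping with H_ab, its nearest
    keypoint in B is at most eps pixels away.
    """
    if len(kpts_a) == 0 or len(kpts_b) == 0:
        return 0.0
    warped = warp_points(kpts_a, H_ab)
    # Pairwise distances between warped A keypoints and B keypoints.
    dists = np.linalg.norm(warped[:, None, :] - kpts_b[None, :, :], axis=2)
    repeated = dists.min(axis=1) <= eps
    return float(repeated.mean())


# Toy case: a 90-degree in-plane rotation about the origin.
theta = np.pi / 2
H = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta),  np.cos(theta), 0.0],
    [0.0,            0.0,           1.0],
])
kpts_a = np.array([[10.0, 0.0], [0.0, 20.0], [5.0, 5.0]])
# Pretend the detector re-finds only the first two keypoints in image B.
kpts_b = warp_points(kpts_a[:2], H)
score = repeatability(kpts_a, kpts_b, H)
```

In this toy case two of the three keypoints are found again after the rotation, so the score is 2/3. A detector that is rotation equivariant by design would be expected to keep this score high as the rotation angle changes.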
Taxonomy
Topics: Video Analysis and Summarization · Advanced Image and Video Retrieval Techniques · Human Pose and Action Recognition
