From Pairs to Sequences: Track-Aware Policy Gradients for Keypoint Detection

Yepeng Liu; Hao Li; Liwen Yang; Fangzhen Li; Xudi Ge; Yuliang Gu; kuang Gao; Bing Wang; Guang Chen; Hangjun Ye; Yongchao Xu

arXiv:2602.20630·cs.CV·May 18, 2026

TL;DR

TraqPoint introduces a reinforcement learning framework that optimizes keypoint detection for sequences, improving long-term trackability and consistency across views in 3D vision tasks.

Contribution

It reframes keypoint detection as a sequential decision problem and proposes a track-aware reward mechanism guided by policy gradients.
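
To make the policy-gradient idea concrete, here is a toy sketch, not the paper's implementation. It treats keypoint selection as sampling one of a few candidate locations from a softmax policy and applies a REINFORCE update with a running baseline. The reward (`track_reward`, a product of per-candidate consistency and distinctiveness scores), the function names, and all hyperparameters are assumptions made for illustration; this page does not give TraqPoint's actual reward or architecture.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over candidate keypoint locations.
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def track_reward(consistency, distinctiveness):
    # Hypothetical track-aware reward: a candidate scores highly only if it is
    # both re-detected across views (consistency) and easy to tell apart
    # from other points (distinctiveness). Both inputs are assumed in [0, 1].
    return consistency * distinctiveness

def reinforce(consistency, distinctiveness, steps=3000, lr=0.2, seed=0):
    rng = np.random.default_rng(seed)
    n = len(consistency)
    logits = np.zeros(n)            # policy parameters, one per candidate
    rewards = track_reward(consistency, distinctiveness)
    baseline = 0.0                  # running mean reward, reduces variance
    for _ in range(steps):
        probs = softmax(logits)
        a = rng.choice(n, p=probs)  # sample one keypoint (the "action")
        r = rewards[a]
        # Gradient of log pi(a) w.r.t. the logits is onehot(a) - probs.
        grad = -probs
        grad[a] += 1.0
        logits += lr * (r - baseline) * grad
        baseline += 0.05 * (r - baseline)
    return softmax(logits)

# Four candidate locations with made-up scores.
consistency = np.array([0.9, 0.2, 0.8, 0.5])
distinctiveness = np.array([0.3, 0.9, 0.9, 0.5])
probs = reinforce(consistency, distinctiveness)
```

In this toy setting the policy shifts probability mass toward the candidate whose combined score is highest, rather than toward one that is only consistent or only distinctive. A real detector would replace the per-candidate logits with a network's score map and compute the reward from tracks over an image sequence.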

Findings

01. TraqPoint outperforms state-of-the-art methods on sparse matching benchmarks.

02. It improves keypoint consistency and distinctiveness across multiple views.

03. The approach enhances relative pose estimation and 3D reconstruction accuracy.

Abstract

Keypoint-based matching is a fundamental component of modern 3D vision systems, such as Structure-from-Motion (SfM) and SLAM. Most existing learning-based methods are trained on image pairs, a paradigm that fails to explicitly optimize for the long-term trackability of keypoints across sequences under challenging viewpoint and illumination changes. In this paper, we reframe keypoint detection as a sequential decision-making problem. We introduce TraqPoint, a novel, end-to-end Reinforcement Learning (RL) framework designed to optimize the Track-quality (Traq) of keypoints directly on image sequences. Our core innovation is a track-aware reward mechanism that jointly encourages the consistency and distinctiveness of keypoints across multiple views, guided by a policy gradient method. Extensive evaluations on sparse matching benchmarks, including relative pose estimation…

Code & Models

Repositories

xiaomi-research/traqpoint (GitHub)

Taxonomy

Topics: Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques · Advanced Vision and Imaging