Keypoints Tracking via Transformer Networks
Oleksii Nasypanyi, Francois Rameau

TL;DR
This paper introduces a fast and robust keypoints tracking method using transformer networks, capable of real-time performance and resilient to challenging conditions like occlusion and illumination changes.
Contribution
The work presents a novel two-stage transformer-based architecture for real-time, robust keypoints tracking, addressing speed and repeatability issues of previous deep learning methods.
Findings
Achieves competitive accuracy in keypoints tracking
Demonstrates robustness under adverse conditions
Operates in real-time with high efficiency
Abstract
In this thesis, we propose a pioneering work on sparse keypoints tracking across images using transformer networks. While deep learning-based keypoints matching has been widely investigated using graph neural networks and, more recently, transformer networks, these methods remain too slow to operate in real time and are particularly sensitive to the poor repeatability of keypoints detectors. To address these shortcomings, we study the particular case of real-time and robust keypoints tracking. Specifically, we propose a novel architecture that ensures fast and robust estimation of keypoints tracks between successive images of a video sequence. Our method takes advantage of a recent breakthrough in computer vision, namely visual transformer networks. It consists of two successive stages, a coarse matching followed by a fine localization of the…
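The abstract is cut off before it details the two stages, so the following is only a minimal sketch of the general coarse-to-fine idea, not the authors' architecture. It assumes that each tracked keypoint carries a descriptor vector, that the next frame provides a coarse feature grid and a finer feature map, and it swaps the transformer for a plain dot-product similarity. The names `coarse_match`, `fine_localize`, `stride`, and `temperature` are all hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length descriptor vectors."""
    return sum(x * y for x, y in zip(a, b))


def coarse_match(query, coarse_feats):
    """Stage 1 (assumed): pick the coarse grid cell most similar to the query.

    coarse_feats is a 2D grid (list of rows) of descriptor vectors.
    Returns the (row, col) index of the best-scoring cell.
    """
    best, best_score = None, -math.inf
    for r, row in enumerate(coarse_feats):
        for c, feat in enumerate(row):
            score = dot(query, feat)
            if score > best_score:
                best, best_score = (r, c), score
    return best


def fine_localize(query, fine_feats, cell, stride, temperature=0.1):
    """Stage 2 (assumed): refine inside the chosen cell with a soft-argmax.

    fine_feats is a 2D grid of descriptors at `stride` times the coarse
    resolution. Returns a sub-pixel (row, col) position in the fine grid.
    """
    r0, c0 = cell[0] * stride, cell[1] * stride
    scores = {
        (r, c): dot(query, fine_feats[r][c]) / temperature
        for r in range(r0, r0 + stride)
        for c in range(c0, c0 + stride)
    }
    top = max(scores.values())  # subtract the max for numerical stability
    weights = {k: math.exp(v - top) for k, v in scores.items()}
    total = sum(weights.values())
    row = sum(r * w for (r, _), w in weights.items()) / total
    col = sum(c * w for (_, c), w in weights.items()) / total
    return row, col
```

In this sketch the coarse stage only scans a small grid, and the fine stage scores just `stride * stride` positions per keypoint, which is one plausible way such a split keeps the per-frame cost low.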
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Video Surveillance and Tracking Methods · Video Analysis and Summarization
