DeepDetect: Learning All-in-One Dense Keypoints
Shaharyar Ahmed Khan Tareen, Filza Khan Tareen, Xiaojing Yuan

TL;DR
DeepDetect is an all-in-one dense keypoint detector that combines classical methods with deep learning to improve robustness, density, and semantic awareness across challenging scenes.
Contribution
It introduces a fusion-based method for creating ground-truth masks and trains a lightweight ESPNet model on them for dense, semantically aware keypoint detection.
Findings
Outperforms existing detectors on multiple datasets
Achieves high keypoint density and repeatability
Enables more accurate 3D reconstructions
Abstract
Keypoint detection is the foundation of many computer vision tasks, including image registration, structure-from-motion, 3D reconstruction, visual odometry, and SLAM. Traditional detectors (SIFT, ORB, BRISK, FAST, etc.) and learning-based methods (SuperPoint, R2D2, QuadNet, LIFT, etc.) have delivered strong performance gains, yet they suffer from key limitations: sensitivity to photometric changes, low keypoint density and repeatability, limited adaptability to challenging scenes, and a lack of semantic understanding, so they often fail to prioritize visually important regions. We present DeepDetect, an intelligent, all-in-one, dense detector that unifies the strengths of classical detectors using deep learning. First, we create ground-truth masks by fusing the outputs of 7 keypoint and 2 edge detectors, extracting diverse visual cues, from corners and blobs to prominent edges and textures, in the images.…
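The mask-fusion step described in the abstract can be pictured with a small sketch. The excerpt does not state the exact fusion rule, so this example assumes a plain pixel-wise union of all detector outputs. The function name `fuse_masks`, its arguments, and the input formats (keypoints as `(x, y)` pairs, edge maps as arrays with non-zero edge pixels) are hypothetical and chosen only for illustration.

```python
import numpy as np


def fuse_masks(shape, keypoint_sets, edge_maps):
    """Fuse detector outputs into one binary mask (assumed rule: pixel-wise union).

    shape         -- (height, width) of the image
    keypoint_sets -- one list of (x, y) keypoint coordinates per keypoint detector
    edge_maps     -- one (height, width) array per edge detector; non-zero means edge
    """
    height, width = shape
    mask = np.zeros(shape, dtype=np.uint8)

    # Mark every keypoint reported by any detector, skipping out-of-bounds points.
    for points in keypoint_sets:
        for x, y in points:
            col, row = int(round(x)), int(round(y))
            if 0 <= row < height and 0 <= col < width:
                mask[row, col] = 1

    # Add every pixel that any edge detector flagged.
    for edges in edge_maps:
        mask |= (np.asarray(edges) > 0).astype(np.uint8)

    return mask
```

In practice the keypoint lists could come from detectors such as SIFT or ORB and the edge maps from an edge detector, but which detectors are fused and how their responses are weighted is not given in this excerpt. A union is simply the most permissive choice: it keeps every cue that any detector found.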
