DeepDetect: Learning All-in-One Dense Keypoints

Shaharyar Ahmed Khan Tareen; Filza Khan Tareen; Xiaojing Yuan

arXiv:2510.17422·cs.CV·April 21, 2026

TL;DR

DeepDetect is an all-in-one dense keypoint detector that combines classical methods with deep learning to improve robustness, density, and semantic awareness across challenging scenes.

Contribution

It introduces a novel fusion-based ground-truth creation and trains a lightweight ESPNet model for dense, semantically-aware keypoint detection.

Findings

01 Outperforms existing detectors on multiple datasets

02 Achieves high keypoint density and repeatability

03 Enables more accurate 3D reconstructions

Abstract

Keypoint detection is the foundation of many computer vision tasks, including image registration, structure-from-motion, 3D reconstruction, visual odometry, and SLAM. Traditional detectors (SIFT, ORB, BRISK, FAST, etc.) and learning-based methods (SuperPoint, R2D2, QuadNet, LIFT, etc.) perform strongly, yet they suffer from key limitations: sensitivity to photometric changes, low keypoint density and repeatability, limited adaptability to challenging scenes, and a lack of semantic understanding, so they often fail to prioritize visually important regions. We present DeepDetect, an intelligent, all-in-one, dense detector that unifies the strengths of classical detectors using deep learning. First, we create ground-truth masks by fusing the outputs of 7 keypoint and 2 edge detectors, extracting diverse visual cues, from corners and blobs to prominent edges and textures, in the images.…
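The mask-fusion step described in the abstract can be illustrated with a small sketch. The abstract does not state the fusion rule, so the pixel-wise union (logical OR) used here is an assumption, as are the helper names `keypoints_to_mask`, `fuse_masks`, and `mask_density`; the toy detector outputs stand in for real detectors such as SIFT or an edge detector. This is a minimal illustration, not the authors' implementation.

```python
from typing import Iterable, List, Sequence, Tuple

Mask = List[List[int]]   # binary mask stored as rows of 0/1 values
Point = Tuple[int, int]  # keypoint location as (row, col)


def keypoints_to_mask(points: Iterable[Point], height: int, width: int) -> Mask:
    """Rasterize keypoint locations into a binary mask (hypothetical helper)."""
    mask = [[0] * width for _ in range(height)]
    for r, c in points:
        # Silently drop keypoints that fall outside the image bounds.
        if 0 <= r < height and 0 <= c < width:
            mask[r][c] = 1
    return mask


def fuse_masks(masks: Sequence[Mask]) -> Mask:
    """Fuse binary masks with a pixel-wise OR (assumed fusion rule)."""
    if not masks:
        raise ValueError("at least one mask is required")
    height, width = len(masks[0]), len(masks[0][0])
    fused = [[0] * width for _ in range(height)]
    for m in masks:
        if len(m) != height or any(len(row) != width for row in m):
            raise ValueError("all masks must share the same shape")
        for r in range(height):
            for c in range(width):
                if m[r][c]:
                    fused[r][c] = 1
    return fused


def mask_density(mask: Mask) -> float:
    """Fraction of pixels marked as keypoints."""
    return sum(map(sum, mask)) / (len(mask) * len(mask[0]))


# Toy outputs standing in for two keypoint detectors and one edge detector.
height, width = 4, 5
detector_a = [(0, 0), (2, 3)]
detector_b = [(2, 3), (3, 4), (9, 9)]  # (9, 9) lies outside the image
edge_map = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

fused = fuse_masks([
    keypoints_to_mask(detector_a, height, width),
    keypoints_to_mask(detector_b, height, width),
    edge_map,
])
density = mask_density(fused)
```

A union keeps every cue that any detector found, which favors density; a voting or weighted rule would trade some density for agreement between detectors.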
