D2-Net: A Trainable CNN for Joint Detection and Description of Local Features
Mihai Dusmanu, Ignacio Rocco, Tomas Pajdla, Marc Pollefeys, Josef Sivic, Akihiko Torii, Torsten Sattler

TL;DR
D2-Net uses a single CNN as both a dense feature descriptor and a feature detector, yielding more stable keypoints under difficult imaging conditions. It is trained on pixel correspondences taken from large-scale SfM reconstructions.
Contribution
The paper presents a novel CNN architecture that combines detection and description of local features, trained without additional annotations, enhancing robustness in difficult scenarios.
Findings
Achieves state-of-the-art results on Aachen Day-Night and InLoc datasets.
Demonstrates competitive performance on image matching and 3D reconstruction benchmarks.
Uses large-scale SfM reconstructions for training without extra annotations.
Abstract
In this work we address the problem of finding reliable pixel-level correspondences under difficult imaging conditions. We propose an approach where a single convolutional neural network plays a dual role: It is simultaneously a dense feature descriptor and a feature detector. By postponing the detection to a later stage, the obtained keypoints are more stable than their traditional counterparts based on early detection of low-level structures. We show that this model can be trained using pixel correspondences extracted from readily available large-scale SfM reconstructions, without any further annotations. The proposed method obtains state-of-the-art performance on both the difficult Aachen Day-Night localization dataset and the InLoc indoor localization benchmark, as well as competitive performance on other benchmarks for image matching and 3D reconstruction.
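The dual role described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function name `detect_and_describe`, the 3x3 neighbourhood, and the exact detection rule (keep a pixel when its strongest channel is also a spatial local maximum in that channel) are assumptions chosen for illustration. The input is any dense feature map of shape (C, H, W), such as the output of a CNN layer.

```python
import numpy as np

def detect_and_describe(feature_map):
    """Hypothetical helper: derive keypoints and descriptors from one feature map.

    feature_map: array of shape (C, H, W), e.g. the activations of a CNN layer.
    Returns (keypoints, descriptors): (N, 2) row/col positions and (N, C) vectors.
    """
    fm = np.asarray(feature_map, dtype=float)
    C, H, W = fm.shape

    # Description: the C-dimensional vector at each pixel, L2-normalised.
    norms = np.linalg.norm(fm, axis=0, keepdims=True)
    dense_descriptors = fm / np.maximum(norms, 1e-12)

    # Detection step 1: the channel with the strongest response at each pixel.
    best_channel = fm.argmax(axis=0)  # shape (H, W)

    # Detection step 2: per-channel maximum over an assumed 3x3 neighbourhood.
    padded = np.pad(fm, ((0, 0), (1, 1), (1, 1)), constant_values=-np.inf)
    neighbourhood_max = np.full((C, H, W), -np.inf)
    for dy in range(3):
        for dx in range(3):
            window = padded[:, dy:dy + H, dx:dx + W]
            neighbourhood_max = np.maximum(neighbourhood_max, window)
    is_local_max = fm >= neighbourhood_max  # ties count as maxima in this sketch

    # A pixel is kept when its strongest channel is also a local maximum there.
    rows, cols = np.indices((H, W))
    detected = is_local_max[best_channel, rows, cols]

    keypoints = np.argwhere(detected)  # (N, 2) as (row, col)
    descriptors = dense_descriptors[:, keypoints[:, 0], keypoints[:, 1]].T
    return keypoints, descriptors

# Usage with a random stand-in for real CNN activations.
features = np.random.default_rng(0).random((8, 16, 16))
keypoints, descriptors = detect_and_describe(features)
```

Because detection runs on the same feature map that supplies the descriptors, keypoints are chosen from high-level responses rather than from low-level image structures, which is the design choice the abstract credits for the improved stability.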
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization · Advanced Neural Network Applications
