End-to-end learning of keypoint detector and descriptor for pose invariant 3D matching

Georgios Georgakis, Srikrishna Karanam, Ziyan Wu, Jan Ernst, and Jana Kosecka

arXiv:1802.07869·cs.CV·May 10, 2018

TL;DR

This paper introduces an end-to-end learning framework for jointly detecting keypoints and extracting descriptors from 3D scans, improving 3D matching accuracy without requiring separate annotations.

Contribution

The paper learns keypoint detection and description jointly for 3D data and optimizes both directly for the matching task, whereas earlier methods addressed the two stages separately or focused on image data.

Findings

01

Significant improvement over state-of-the-art methods on benchmark datasets.

02

Effective joint optimization of keypoint detection and description.

03

Automatic sampling of positive and negative examples based on pose labels.
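To make the third finding concrete, the sketch below shows one way pose labels could be turned into positive and negative training pairs. It is an illustration under stated assumptions, not the authors' sampling layer: the function names `transform_point` and `label_pairs`, the distance threshold `pos_thresh`, and the toy points are all hypothetical. A candidate pair counts as positive when a 3D point from scan A, mapped into scan B's frame with the known relative pose, lands close to the point from scan B.

```python
import math

def transform_point(pose, p):
    """Apply a 4x4 rigid transform (row-major nested lists) to a 3D point."""
    x, y, z = p
    return tuple(
        pose[r][0] * x + pose[r][1] * y + pose[r][2] * z + pose[r][3]
        for r in range(3)
    )

def label_pairs(points_a, points_b, pose_a_to_b, pos_thresh=0.05):
    """Label every (i, j) candidate pair as positive (1) or negative (0).

    A pair is positive when point i from scan A, mapped into scan B's frame
    with the known relative pose, lies within pos_thresh of point j.
    pos_thresh is an assumed value chosen only for this example.
    """
    labels = {}
    for i, pa in enumerate(points_a):
        pa_in_b = transform_point(pose_a_to_b, pa)
        for j, pb in enumerate(points_b):
            labels[(i, j)] = 1 if math.dist(pa_in_b, pb) <= pos_thresh else 0
    return labels

# Hypothetical toy data: scan B's frame is scan A's frame shifted by 1 unit in x.
pose = [[1, 0, 0, 1.0],
        [0, 1, 0, 0.0],
        [0, 0, 1, 0.0],
        [0, 0, 0, 1.0]]
points_a = [(0.0, 0.0, 1.0), (0.5, 0.2, 1.5)]
points_b = [(1.0, 0.0, 1.0), (3.0, 3.0, 3.0)]
labels = label_pairs(points_a, points_b, pose)
```

Because the labels come only from geometry and the known pose, no manual keypoint or correspondence annotation is needed, which is the property the finding describes.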

Abstract

Finding correspondences between images or 3D scans is at the heart of many computer vision and image retrieval applications and is often enabled by matching local keypoint descriptors. Various learning approaches have been applied in the past to different stages of the matching pipeline, considering detector, descriptor, or metric learning objectives. These objectives were typically addressed separately and most previous work has focused on image data. This paper proposes an end-to-end learning framework for keypoint detection and its representation (descriptor) for 3D depth maps or 3D scans, where the two can be jointly optimized towards task-specific objectives without a need for separate annotations. We employ a Siamese architecture augmented by a sampling layer and a novel score loss function which in turn affects the selection of region proposals. The positive and negative examples…
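The abstract as shown here does not spell out the descriptor objective, and the novel score loss is not reproduced below. As a rough illustration of how a Siamese branch pair is commonly trained, the sketch uses a standard contrastive loss, which is an assumption on our part and not confirmed as the paper's formulation. The function name `contrastive_loss` and the `margin` parameter are hypothetical.

```python
import math

def contrastive_loss(desc_a, desc_b, is_match, margin=1.0):
    """Standard contrastive loss for one pair of descriptors.

    Matching pairs are penalized by their squared distance; non-matching
    pairs are penalized only when they are closer than the margin.
    """
    d = math.dist(desc_a, desc_b)
    if is_match:
        return d ** 2
    return max(0.0, margin - d) ** 2

# Identical descriptors that match incur no loss.
loss_pos = contrastive_loss((0.0, 0.0), (0.0, 0.0), True)
# A non-matching pair that is already farther apart than the margin incurs no loss.
loss_far_neg = contrastive_loss((0.0, 0.0), (3.0, 4.0), False)
# A non-matching pair inside the margin is pushed apart.
loss_near_neg = contrastive_loss((0.0, 0.0), (0.6, 0.8), False, margin=2.0)
```

In a setup like the one the abstract describes, the match labels fed to such a loss would come from the pose-based sampling, so both branches can be trained without separate annotations.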
