Learning and Crafting for the Wide Multiple Baseline Stereo

Dmytro Mishkin

arXiv:2112.12027·cs.CV·December 23, 2021

TL;DR

This thesis advances wide multiple baseline stereo (WxBS) by developing new descriptors, learning methods, and algorithms that improve matching across diverse imaging conditions, and introduces a comprehensive benchmark dataset.

Contribution

The thesis introduces the WxBS problem, a new dataset with ground truth, the HardNeg descriptor training loss, a geometric feature learning method, an improved correspondence strategy, and the MODS matching algorithm.

Findings

01. The HardNet descriptor, trained with the HardNeg loss, is compact and achieves state-of-the-art performance in standard matching.

02. The MODS algorithm surpasses previous methods on image pairs with large viewpoint changes.

03. A comprehensive benchmark for local features and robust estimation is provided.

Abstract

This thesis introduces the wide multiple baseline stereo (WxBS) problem. WxBS, a generalization of the standard wide baseline stereo problem, considers the matching of images that simultaneously differ in more than one image acquisition factor such as viewpoint, illumination, sensor type, or where object appearance changes significantly, e.g., over time. A new dataset with the ground truth, evaluation metric and baselines has been introduced. The thesis presents the following improvements of the WxBS pipeline. (i) A loss function, called HardNeg, for learning a local image descriptor that relies on hard negative mining within a mini-batch and on the maximization of the distance between the closest positive and the closest negative patches. (ii) The descriptor trained with the HardNeg loss, called HardNet, is compact and shows state-of-the-art performance in standard matching, patch…
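
The HardNeg loss described in item (i) can be sketched as follows. This is a minimal illustration, not the thesis code: it assumes a batch of matching descriptor pairs (anchor `i` matches positive `i`), Euclidean distance, and a triplet-style hinge with a margin. The function names `l2_distance` and `hardest_in_batch_loss` and the default `margin=1.0` are assumptions made for this example.

```python
import math


def l2_distance(u, v):
    """Euclidean distance between two descriptors given as sequences of floats."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def hardest_in_batch_loss(anchors, positives, margin=1.0):
    """Triplet margin loss with the hardest negative mined inside the mini-batch.

    anchors[i] and positives[i] are descriptors of the same patch; every other
    descriptor in the batch is treated as a negative for pair i.
    The margin value is an assumption of this sketch.
    """
    n = len(anchors)
    if n < 2 or n != len(positives):
        raise ValueError("need at least two matching pairs of equal count")

    # Pairwise distance matrix: dist[i][j] = d(anchor_i, positive_j).
    dist = [[l2_distance(a, p) for p in positives] for a in anchors]

    total = 0.0
    for i in range(n):
        d_pos = dist[i][i]
        # Closest non-matching positive to anchor i (search along the row).
        neg_row = min(dist[i][j] for j in range(n) if j != i)
        # Closest non-matching anchor to positive i (search along the column).
        neg_col = min(dist[k][i] for k in range(n) if k != i)
        d_neg = min(neg_row, neg_col)
        # Hinge: penalize pairs whose hardest negative is not at least
        # `margin` farther away than the matching descriptor.
        total += max(0.0, margin + d_pos - d_neg)
    return total / n
```

Calling `hardest_in_batch_loss(anchors, positives)` on two lists of equal-length descriptor vectors returns the mean hinge value over the batch. Mining the negative from both the row and the column of the distance matrix means each pair is compared against the closest non-matching descriptor on either side, which is what makes the negative "hard".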

Taxonomy

Topics: Advanced Vision and Imaging