Bootstrap Your Own Correspondences

Mohamed El Banani; Justin Johnson

arXiv:2106.00677 · cs.CV · June 2, 2021

TL;DR

BYOC is a self-supervised method that learns visual and geometric features from RGB-D video without ground-truth pose or correspondence annotations. On indoor scene registration it outperforms traditional and learned descriptors.

Contribution

It presents a self-supervised framework that bootstraps feature learning from the correspondences produced by randomly-initialized CNNs, so that both visual and geometric features are learned without ground-truth pose or correspondence.

Findings

01. Outperforms traditional descriptors in indoor scene registration

02. Competitive with state-of-the-art supervised methods

03. Effective without ground-truth annotations

Abstract

Geometric feature extraction is a crucial component of point cloud registration pipelines. Recent work has demonstrated how supervised learning can be leveraged to learn better and more compact 3D features. However, those approaches' reliance on ground-truth annotation limits their scalability. We propose BYOC: a self-supervised approach that learns visual and geometric features from RGB-D video without relying on ground-truth pose or correspondence. Our key observation is that randomly-initialized CNNs readily provide us with good correspondences, allowing us to bootstrap the learning of both visual and geometric features. Our approach combines classic ideas from point cloud registration with more recent representation learning approaches. We evaluate our approach on indoor scene datasets and find that our method outperforms traditional and learned descriptors, while being competitive…
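
The abstract does not spell out how correspondences become a registration, so the following is only a minimal sketch of the classic idea it alludes to, not the authors' implementation. It assumes per-point feature vectors are already available (here they could come from any network, trained or not), matches them by cosine nearest neighbor, and fits a rigid transform with a weighted Kabsch (Procrustes) solve. The function names `nearest_neighbor_matches` and `weighted_kabsch` are hypothetical and chosen for illustration.

```python
import numpy as np


def nearest_neighbor_matches(feats_src, feats_tgt):
    """Match each source feature to its most similar target feature (cosine).

    Hypothetical helper: returns the matched target index for every source
    point and a non-negative similarity used as a confidence weight.
    """
    a = feats_src / np.linalg.norm(feats_src, axis=1, keepdims=True)
    b = feats_tgt / np.linalg.norm(feats_tgt, axis=1, keepdims=True)
    sim = a @ b.T                          # (N, M) cosine similarities
    idx = sim.argmax(axis=1)               # best target index per source point
    weights = sim[np.arange(len(a)), idx].clip(min=0.0)
    return idx, weights


def weighted_kabsch(src, tgt, weights):
    """Weighted least-squares rigid transform (R, t) with R @ src_i + t ~= tgt_i."""
    w = weights / weights.sum()
    src_mean = w @ src
    tgt_mean = w @ tgt
    # Weighted cross-covariance of the centered point sets (3 x 3).
    H = (src - src_mean).T @ (w[:, None] * (tgt - tgt_mean))
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_mean - R @ src_mean
    return R, t
```

Given two frames with 3D points `src`, `tgt` and features `feats_src`, `feats_tgt`, one would call `idx, w = nearest_neighbor_matches(feats_src, feats_tgt)` and then `R, t = weighted_kabsch(src, tgt[idx], w)`. Because the solve is closed-form, the quality of the estimated pose depends entirely on how good the feature matches are, which is why the source of the initial correspondences matters.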
