Self-supervised Learning of 3D Object Understanding by Data Association and Landmark Estimation for Image Sequence

Hyeonwoo Yu; Jean Oh

arXiv:2104.07077·cs.CV·April 16, 2021

Open Access

TL;DR

This paper introduces a self-supervised approach for 3D multi-object pose estimation from image sequences, leveraging data association and landmark estimation to improve accuracy without extensive 3D annotations.

Contribution

It proposes a self-supervised learning strategy that combines multiple observations of the same object, linked through data association, so that 3D object pose estimation is no longer bounded by the network's own single-image self-annotation performance.

Findings

01

Improved 3D pose estimation accuracy on KITTI dataset

02

Effective use of image sequences for self-supervised learning

03

Enhanced network performance through iterative fine-tuning

Abstract

In this paper, we propose a self-supervised learning method for multi-object pose estimation. 3D object understanding from a 2D image is a challenging task that infers an additional dimension from reduced-dimensional information. In particular, the estimation of the 3D localization or orientation of an object requires precise reasoning, unlike other simple clustering tasks such as object classification. Therefore, the scale of the training dataset becomes more crucial. However, it is challenging to obtain a large amount of 3D data, since 3D annotation is expensive and time-consuming. If the scale of the training dataset can be increased by involving the image sequence obtained from simple navigation, it is possible to overcome the scale limitation of the dataset and to adapt efficiently to a new environment. However, when the self annotation is conducted on single image by…

Taxonomy

Topics: Robotics and Sensor-Based Localization · Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning