Joint stereo 3D object detection and implicit surface reconstruction

Shichao Li; Xijie Huang; Zechun Liu; Kwang-Ting Cheng

PMC · DOI: 10.1038/s41598-024-64677-2 · June 17, 2024

Open Access

TL;DR

This paper introduces S-3D-RCNN, a learning-based framework that jointly performs 3D object detection and implicit shape reconstruction from stereo RGB images, improving both orientation estimation and surface modeling.

Contribution

The main contribution is a progressive approach that extracts Intermediate Geometrical Representations (IGRs) to estimate object orientation and implicit shape from stereo RGB images.

Findings

01

The proposed IGRs improve orientation estimation accuracy in SO(3) compared to previous methods.

02

S-3D-RCNN achieves superior 3D scene understanding performance on benchmark datasets.

03

New metrics were developed for evaluating implicit shape estimation on the KITTI benchmark.

Abstract

We present a new learning-based framework S-3D-RCNN that can recover accurate object orientation in SO(3) and simultaneously predict implicit rigid shapes from stereo RGB images. For orientation estimation, in contrast to previous studies that map local appearance to observation angles, we propose a progressive approach by extracting meaningful Intermediate Geometrical Representations (IGRs). This approach features a deep model that transforms perceived intensities from one or two views to object part coordinates to achieve direct egocentric object orientation estimation in the camera coordinate system. To further achieve finer description inside 3D bounding boxes, we investigate the implicit shape estimation problem from stereo images. We model visible object surfaces by designing a point-based representation, augmenting IGRs to explicitly address the unseen surface hallucination…
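To make "orientation in SO(3)" concrete, here is a minimal sketch of what it means for a 3x3 matrix to be a rotation and how the gap between a predicted and a reference orientation can be measured. It is an illustration only, not the paper's method or its evaluation protocol. The helper names (`rot_x`, `rot_y`, `is_rotation`, `geodesic_angle`) are hypothetical. The geodesic angle is a commonly used rotation-error measure and is assumed here; the abstract does not say which error measure the authors use.

```python
import math

def rot_y(theta):
    """3x3 rotation about the Y axis by theta radians (assumed yaw axis)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_x(theta):
    """3x3 rotation about the X axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def is_rotation(m, tol=1e-9):
    """True if m lies in SO(3): m^T m = I and det(m) = +1."""
    mtm = matmul(transpose(m), m)
    orthonormal = all(
        abs(mtm[i][j] - (1.0 if i == j else 0.0)) < tol
        for i in range(3) for j in range(3)
    )
    return orthonormal and abs(det3(m) - 1.0) < tol

def geodesic_angle(r_pred, r_true):
    """Angle in radians of the relative rotation r_pred^T r_true."""
    rel = matmul(transpose(r_pred), r_true)
    trace = rel[0][0] + rel[1][1] + rel[2][2]
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.acos(cos_angle)

# A yaw-only prediction compared with a reference that also has a small pitch.
r_true = matmul(rot_y(0.5), rot_x(0.1))
r_pred = rot_y(0.5)
error = geodesic_angle(r_pred, r_true)
```

In this toy case the yaw-only prediction misses the 0.1 rad rotation about the X axis, so the full SO(3) error is nonzero even though the yaw is exact. This is the kind of gap that a single observation angle cannot express.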

Figures: 16

Taxonomy

Topics: Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques · Human Pose and Action Recognition