ODAM: Object Detection, Association, and Mapping using Posed RGB Video

Kejie Li; Daniel DeTone; Steven Chen; Minh Vo; Ian Reid; Hamid Rezatofighi; Chris Sweeney; Julian Straub; Richard Newcombe

arXiv:2108.10165 · cs.CV · August 24, 2021

TL;DR

ODAM is a system that detects 3D objects in posed RGB video with a deep learning front-end, associates them to a global object-based map using a graph neural network (GNN), and optimizes their bounding volumes in a back-end, with applications in AR and robotics.

Contribution

It introduces a pipeline that takes posed RGB video as input and integrates 3D object detection, GNN-based frame-to-model association, and back-end optimization of object bounding volumes represented as super-quadrics.

Findings

01. Significant improvement over existing RGB-only methods on ScanNet.

02. Back-end optimization of super-quadric object volumes under multi-view geometry constraints and an object scale prior.

03. GNN-based association of per-frame detections to a global object-based map.
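Association here means deciding which existing map object, if any, each new per-frame detection belongs to. ODAM learns this decision with a GNN; the sketch below is not that method. It is a hand-written geometric stand-in (greedy matching on 3D IoU of axis-aligned boxes) meant only to illustrate the input and output of a frame-to-model association step. The function names, the box layout, and the `min_iou` threshold are all assumptions made for illustration.

```python
def iou_3d(a, b):
    """IoU of two axis-aligned boxes (xmin, ymin, zmin, xmax, ymax, zmax)."""
    inter = 1.0
    for i in range(3):
        lo = max(a[i], b[i])
        hi = min(a[i + 3], b[i + 3])
        if hi <= lo:  # no overlap along this axis
            return 0.0
        inter *= hi - lo

    def volume(box):
        return (box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2])

    return inter / (volume(a) + volume(b) - inter)


def associate(detections, map_objects, min_iou=0.25):
    """Greedy one-to-one matching of detections to map objects.

    Returns {detection_index: map_object_index or None}.
    Hypothetical baseline; ODAM itself uses a learned GNN here.
    """
    # Score every detection/map-object pair, best overlap first.
    pairs = sorted(
        (
            (iou_3d(d, m), i, j)
            for i, d in enumerate(detections)
            for j, m in enumerate(map_objects)
        ),
        reverse=True,
    )
    matches = {i: None for i in range(len(detections))}
    used = set()
    for score, i, j in pairs:
        if score < min_iou:
            break  # remaining pairs overlap too little to match
        if matches[i] is None and j not in used:
            matches[i] = j
            used.add(j)
    return matches
```

In a mapping loop of this shape, a detection left as `None` would typically be a candidate for a new map object, while matched detections contribute further observations of an existing one.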

Abstract

Localizing objects and estimating their extent in 3D is an important step towards high-level 3D scene understanding, which has many applications in Augmented Reality and Robotics. We present ODAM, a system for 3D Object Detection, Association, and Mapping using posed RGB videos. The proposed system relies on a deep learning front-end to detect 3D objects from a given RGB frame and associate them to a global object-based map using a graph neural network (GNN). Based on these frame-to-model associations, our back-end optimizes object bounding volumes, represented as super-quadrics, under multi-view geometry constraints and the object scale prior. We validate the proposed system on ScanNet where we show a significant improvement over existing RGB-only methods.
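To make the super-quadric representation concrete, the following minimal sketch evaluates the standard super-quadric inside-outside function for a point expressed in the object's canonical frame. The abstract does not spell out ODAM's exact parameterization, so the function name, argument names, and the choice of this textbook form are assumptions; the pose that would map world points into the canonical frame is omitted.

```python
def superquadric_inside_outside(point, scale, eps1, eps2):
    """Standard super-quadric inside-outside function F.

    F < 1: point is inside; F == 1: on the surface; F > 1: outside.
    point : (x, y, z) in the object's canonical (centered, axis-aligned) frame
    scale : (a1, a2, a3) semi-axis lengths
    eps1, eps2 : shape exponents (1, 1 gives an ellipsoid; values near 0
                 give a box-like shape)
    Illustrative sketch only, not necessarily ODAM's parameterization.
    """
    # Normalize by the semi-axes; abs() keeps fractional powers real.
    x, y, z = (abs(p) / a for p, a in zip(point, scale))
    xy = (x ** (2.0 / eps2) + y ** (2.0 / eps2)) ** (eps2 / eps1)
    return xy + z ** (2.0 / eps1)


# The same near-corner point lies outside a unit sphere (eps = 1) but inside
# a box-like super-quadric (eps = 0.1) with the same semi-axes.
corner = (0.9, 0.9, 0.9)
outside_sphere = superquadric_inside_outside(corner, (1, 1, 1), 1.0, 1.0) > 1
inside_boxlike = superquadric_inside_outside(corner, (1, 1, 1), 0.1, 0.1) < 1
```

The two shape exponents let a single primitive range from ellipsoids to box-like and cylinder-like volumes, which is one reason a representation like this suits bounding volumes for everyday objects.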

Code & Models

Repositories

likojack/odam (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques

Methods: Graph Neural Network