3D Object Positioning Using Differentiable Multimodal Learning

Sean Zanyk-McLean; Krishna Kumar; Paul Navratil

arXiv:2309.03177·eess.SY·September 7, 2023

Open Access

TL;DR

This paper introduces a multimodal differentiable learning approach combining simulated Lidar and image data to improve 3D object positioning accuracy and convergence speed, with applications in autonomous vehicle scene understanding.

Contribution

It presents a novel fusion of Lidar and image modalities using differentiable rendering for faster and more accurate object position optimization.

Findings

01. Fusing Lidar with image data accelerates convergence.

02. The method improves the accuracy of object positioning in simulated scenes.

03. The approach has potential applications in autonomous vehicle perception systems.

Abstract

This article describes a multi-modal method that combines simulated Lidar data, generated via ray tracing, with image pixel loss from differentiable rendering to optimize an object's position with respect to an observer or to reference objects in a computer graphics scene. Object position optimization is performed using gradient descent, with a loss function influenced by both modalities. Typical object placement optimization uses image pixel loss with differentiable rendering only; this work shows that adding a second modality (Lidar) leads to faster convergence. This method of fusing sensor input is potentially useful for autonomous vehicles, as it can be used to establish the locations of multiple actors in a scene. This article also presents a method for simulating multiple types of data for use in training autonomous vehicles.
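
The idea of gradient descent on a loss influenced by both modalities can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the scene is reduced to a 2D toy with the observer at the origin, the "image" is a 1D row of pixels showing a Gaussian blob at the pinhole projection x / z, the "Lidar" reading is just the range to the object centre, the two loss terms are combined with hypothetical weights w_img and w_lidar, and central finite differences stand in for the gradients that a differentiable renderer would supply.

```python
import math

# Hypothetical toy setup (not from the paper): a 2D world with the observer
# at the origin looking along +z, and a 1D "image" of 21 pixels.
PIXELS = [-1.0 + 0.1 * i for i in range(21)]  # pixel centres on the image line
SIGMA = 0.3                                   # assumed blob width in image units


def render(p):
    """Soft 'image': a Gaussian blob centred at the pinhole projection x / z."""
    u = p[0] / p[1]
    return [math.exp(-((c - u) ** 2) / (2 * SIGMA ** 2)) for c in PIXELS]


def lidar_range(p):
    """Stand-in Lidar return: distance from the observer to the object centre."""
    return math.hypot(p[0], p[1])


def fused_loss(p, target_img, target_range, w_img=1.0, w_lidar=1.0):
    """Weighted sum of mean squared pixel error and squared range error."""
    img = render(p)
    pixel_term = sum((a - b) ** 2 for a, b in zip(img, target_img)) / len(img)
    range_term = (lidar_range(p) - target_range) ** 2
    return w_img * pixel_term + w_lidar * range_term


def numeric_grad(f, p, eps=1e-5):
    """Central finite differences, standing in for autodiff gradients."""
    grad = []
    for i in range(len(p)):
        hi, lo = list(p), list(p)
        hi[i] += eps
        lo[i] -= eps
        grad.append((f(hi) - f(lo)) / (2 * eps))
    return grad


def optimize(p0, p_true, w_img=1.0, w_lidar=1.0, lr=0.3, steps=300):
    """Plain gradient descent on the fused loss; returns (position, loss history)."""
    target_img, target_range = render(p_true), lidar_range(p_true)
    f = lambda p: fused_loss(p, target_img, target_range, w_img, w_lidar)
    p, history = list(p0), []
    for _ in range(steps):
        history.append(f(p))
        g = numeric_grad(f, p)
        p = [pi - lr * gi for pi, gi in zip(p, g)]
    history.append(f(p))
    return p, history


# Recover an assumed ground-truth position [x, z] from a deliberately wrong start.
p_true = [1.0, 4.0]
p_est, losses = optimize([-0.5, 3.0], p_true)
error = math.dist(p_est, p_true)
print(f"estimate={p_est}, final loss={losses[-1]:.3e}, error={error:.3e}")
```

Passing w_lidar=0.0 gives an image-only baseline for comparison. In this toy the pixel term depends only on the projected coordinate x / z, so it cannot pin down range on its own; that is a property of the simplification, not a result reported in the paper.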

Taxonomy

Topics: Robotics and Sensor-Based Localization · Robotic Path Planning Algorithms · Advanced Vision and Imaging