Pix2Point: Learning Outdoor 3D Using Sparse Point Clouds and Optimal Transport
Rémy Leroy, Pauline Trouvé-Peloux, Frédéric Champagnat, Bertrand Le Saux, Marcela Carvalho

TL;DR
Pix2Point is a deep learning method that predicts outdoor 3D point clouds from monocular images using a 2D-3D hybrid neural network trained with an optimal transport loss, achieving better scene coverage than monocular depth methods.
Contribution
It introduces a 2D-3D hybrid neural network architecture, trained with an optimal transport loss, for monocular outdoor 3D point cloud prediction from sparse ground-truth data.
Findings
Outperforms monocular depth methods in scene coverage
Effective with sparse ground-truth datasets
Handles complete and challenging outdoor scenes
Abstract
Good quality reconstruction and comprehension of a scene rely on 3D estimation methods. The 3D information was usually obtained from images by stereo-photogrammetry, but deep learning has recently provided us with excellent results for monocular depth estimation. Building up a sufficiently large and rich training dataset to achieve these results requires onerous processing. In this paper, we address the problem of learning outdoor 3D point cloud from monocular data using a sparse ground-truth dataset. We propose Pix2Point, a deep learning-based approach for monocular 3D point cloud prediction, able to deal with complete and challenging outdoor scenes. Our method relies on a 2D-3D hybrid neural network architecture, and a supervised end-to-end minimisation of an optimal transport divergence between point clouds. We show that, when trained on sparse point clouds, our simple promising…
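To make the "optimal transport divergence between point clouds" more concrete, here is a minimal NumPy sketch of one common choice, the debiased Sinkhorn divergence with a squared Euclidean cost. This is an illustration under stated assumptions, not the authors' implementation: the summary above does not specify the exact divergence, the cost function, or the regularisation, so the function names (`entropic_ot`, `sinkhorn_divergence`) and the parameters `eps` and `n_iters` are hypothetical.

```python
import numpy as np


def _logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))


def entropic_ot(x, y, eps=0.05, n_iters=200):
    """Entropy-regularised OT value between two uniformly weighted point clouds.

    x: (n, 3) array, y: (m, 3) array. Assumed cost: 0.5 * squared distance.
    """
    n, m = len(x), len(y)
    log_a = np.full(n, -np.log(n))  # uniform weights on x, in log space
    log_b = np.full(m, -np.log(m))  # uniform weights on y, in log space
    cost = 0.5 * np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)

    f = np.zeros(n)  # dual potential attached to x
    g = np.zeros(m)  # dual potential attached to y
    for _ in range(n_iters):
        # Log-domain Sinkhorn updates (alternating soft-min projections).
        f = -eps * _logsumexp((g[None, :] - cost) / eps + log_b[None, :], axis=1)
        g = -eps * _logsumexp((f[:, None] - cost) / eps + log_a[:, None], axis=0)

    # Dual objective <a, f> + <b, g> with uniform weights.
    return np.mean(f) + np.mean(g)


def sinkhorn_divergence(x, y, eps=0.05, n_iters=200):
    """Debiased divergence: OT(x, y) - 0.5 * OT(x, x) - 0.5 * OT(y, y)."""
    return (
        entropic_ot(x, y, eps, n_iters)
        - 0.5 * entropic_ot(x, x, eps, n_iters)
        - 0.5 * entropic_ot(y, y, eps, n_iters)
    )


# Toy usage: a random "predicted" cloud and a translated "ground-truth" cloud.
rng = np.random.default_rng(0)
pred = rng.random((64, 3))
target = pred + np.array([2.0, 0.0, 0.0])

d_same = sinkhorn_divergence(pred, pred)
d_shift = sinkhorn_divergence(pred, target)
```

A loss of this kind compares two unordered point sets without requiring a one-to-one correspondence or equal sizes, which is one reason optimal transport is a natural fit when the ground truth is a sparse point cloud. In an actual training loop the same computation would be written in an autodiff framework so that gradients flow back to the predicted points.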
