Mesh R-CNN
Georgia Gkioxari, Jitendra Malik, Justin Johnson

TL;DR
Mesh R-CNN is a unified system that detects objects in real-world images and predicts their full 3D shape as triangle meshes, combining 2D detection with 3D shape reconstruction.
Contribution
The paper introduces Mesh R-CNN, which augments Mask R-CNN with a mesh prediction branch: the branch first predicts a coarse voxel representation, converts it to a triangle mesh, and refines that mesh with a graph convolution network.
Findings
Outperforms prior shape prediction methods on ShapeNet.
Successfully detects objects and predicts 3D shapes in real-world images.
Achieves joint 2D detection and 3D shape prediction on Pix3D.
Abstract
Rapid advances in 2D perception have led to systems that accurately detect objects in real-world images. However, these systems make predictions in 2D, ignoring the 3D structure of the world. Concurrently, advances in 3D shape prediction have mostly focused on synthetic benchmarks and isolated objects. We unify advances in these two areas. We propose a system that detects objects in real-world images and produces a triangle mesh giving the full 3D shape of each detected object. Our system, called Mesh R-CNN, augments Mask R-CNN with a mesh prediction branch that outputs meshes with varying topological structure by first predicting coarse voxel representations which are converted to meshes and refined with a graph convolution network operating over the mesh's vertices and edges. We validate our mesh prediction branch on ShapeNet, where we outperform prior work on single-image shape…
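The voxel-to-mesh-to-refinement pipeline described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (voxels_to_mesh, mesh_edges, graph_conv) are hypothetical, the conversion simply emits the exposed faces of each occupied voxel, and the graph convolution uses an assumed common form (a self term plus a summed-neighbor term) with random, untrained weights.

```python
import numpy as np

# Corners of a unit cube, indexed as 4*x + 2*y + z.
CORNERS = np.array([(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)])
# Each cube face: (direction of the neighbouring voxel, four corner indices).
FACES = [
    ((-1, 0, 0), (0, 1, 3, 2)), ((1, 0, 0), (4, 5, 7, 6)),
    ((0, -1, 0), (0, 1, 5, 4)), ((0, 1, 0), (2, 3, 7, 6)),
    ((0, 0, -1), (0, 2, 6, 4)), ((0, 0, 1), (1, 3, 7, 5)),
]

def voxels_to_mesh(occ):
    """Convert a boolean occupancy grid to (vertices, triangles).

    Only faces that border an empty voxel are kept, and shared corners
    are merged. Triangle winding is not made consistent in this sketch.
    """
    padded = np.pad(occ.astype(bool), 1)  # empty border makes lookups safe
    index, verts, tris = {}, [], []
    for x, y, z in np.argwhere(occ):
        for (dx, dy, dz), quad in FACES:
            if padded[x + 1 + dx, y + 1 + dy, z + 1 + dz]:
                continue  # neighbour is occupied, so this face is interior
            ids = []
            for c in quad:
                key = tuple((np.array((x, y, z)) + CORNERS[c]).tolist())
                if key not in index:
                    index[key] = len(verts)
                    verts.append(key)
                ids.append(index[key])
            # Split the quad into two triangles along one diagonal.
            tris.append((ids[0], ids[1], ids[2]))
            tris.append((ids[0], ids[2], ids[3]))
    return np.array(verts, dtype=float), np.array(tris, dtype=int)

def mesh_edges(tris):
    """Return the unique undirected edges of a triangle mesh."""
    e = np.concatenate([tris[:, [0, 1]], tris[:, [1, 2]], tris[:, [2, 0]]])
    return np.unique(np.sort(e, axis=1), axis=0)

def graph_conv(feats, edges, w_self, w_nbr):
    """One graph convolution over mesh vertices (assumed form):
    each vertex mixes its own features with the sum of its neighbours'."""
    agg = np.zeros_like(feats)
    np.add.at(agg, edges[:, 0], feats[edges[:, 1]])
    np.add.at(agg, edges[:, 1], feats[edges[:, 0]])
    return feats @ w_self + agg @ w_nbr

# Usage: two occupied voxels -> coarse mesh -> one refinement step.
occ = np.zeros((3, 3, 3), dtype=bool)
occ[1, 1, 1] = occ[1, 1, 2] = True
verts, tris = voxels_to_mesh(occ)
edges = mesh_edges(tris)
rng = np.random.default_rng(0)
# Untrained random weights stand in for learned ones (assumption).
offsets = graph_conv(verts, edges,
                     rng.normal(scale=0.01, size=(3, 3)),
                     rng.normal(scale=0.01, size=(3, 3)))
refined = verts + offsets  # refinement moves vertices, topology is unchanged
```

In this sketch the topology of the output is fixed by the voxel grid, which is how a voxel-first branch can produce meshes with varying topological structure, while the graph convolution only adjusts vertex positions. A trained system would also feed image features into the vertex features; that step is omitted here.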
Taxonomy
Topics: 3D Shape Modeling and Analysis · 3D Surveying and Cultural Heritage · Industrial Vision Systems and Defect Detection
Methods: Softmax · RoIAlign · Convolution
