TL;DR
MESC-3D is a single-image 3D reconstruction method that actively mines effective semantic cues from entangled image features and incorporates learned 3D semantic priors, with reported gains in accuracy, robustness, and generalization, including in zero-shot scenarios.
Contribution
The paper proposes an Effective Semantic Mining Module and a 3D Semantic Prior Learning Module to enhance semantic feature utilization and incorporate prior knowledge for better 3D reconstruction from a single image.
Findings
Reconstruction quality and robustness improve significantly.
The method generalizes strongly, including in zero-shot settings.
Mining effective semantic cues improves 3D reconstruction accuracy.
Abstract
Reconstructing 3D shapes from a single image plays an important role in computer vision. Many methods have been proposed and achieve impressive performance. However, existing methods mainly focus on extracting semantic information from images and then simply concatenating it with 3D point clouds without further exploring the concatenated semantics. As a result, these entangled semantic features significantly hinder the reconstruction performance. In this paper, we propose a novel single-image 3D reconstruction method called Mining Effective Semantic Cues for 3D Reconstruction from a Single Image (MESC-3D), which can actively mine effective semantic cues from entangled features. Specifically, we design an Effective Semantic Mining Module to establish connections between point clouds and image semantic attributes, enabling the point clouds to autonomously select the necessary information.…
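The abstract contrasts simple concatenation of image semantics with letting the point cloud select the information it needs. The excerpt does not describe the internals of the Effective Semantic Mining Module, so the following is only a generic illustration of that idea using standard scaled dot-product cross-attention, not the authors' implementation. All names, shapes, and weight matrices (`cross_attend`, `w_q`, `w_k`, `w_v`, the feature dimensions) are assumptions made for this sketch.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(point_feats, image_feats, w_q, w_k, w_v):
    """Each point queries the image tokens and gathers a weighted mix of them.

    point_feats: (N, dp) per-point features (hypothetical shape)
    image_feats: (M, di) image semantic tokens (hypothetical shape)
    Returns the selected semantics (N, d) and the attention weights (N, M).
    """
    q = point_feats @ w_q                      # (N, d) queries from points
    k = image_feats @ w_k                      # (M, d) keys from image tokens
    v = image_feats @ w_v                      # (M, d) values from image tokens
    scores = q @ k.T / np.sqrt(q.shape[-1])    # (N, M) point-to-token affinity
    weights = softmax(scores, axis=-1)         # each row sums to 1
    return weights @ v, weights

# Toy data: random features stand in for real encoders (assumption).
rng = np.random.default_rng(0)
N, M, dp, di, d = 256, 49, 64, 128, 32
point_feats = rng.standard_normal((N, dp))
image_feats = rng.standard_normal((M, di))
w_q = rng.standard_normal((dp, d)) * 0.1
w_k = rng.standard_normal((di, d)) * 0.1
w_v = rng.standard_normal((di, d)) * 0.1

selected, weights = cross_attend(point_feats, image_feats, w_q, w_k, w_v)

# Fuse the per-point selection with the original point features.
fused = np.concatenate([point_feats, selected], axis=1)
print(selected.shape, weights.shape, fused.shape)
```

Unlike attaching one global image vector to every point, each row of `weights` here is a point-specific distribution over image tokens, so different points can draw on different semantic attributes.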
