TL;DR
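Pixal3D is a pixel-aligned image-to-3D generation method: it generates the 3D asset consistently with the input view rather than in a canonical pose, making pixel-to-3D correspondence explicit. This improves fidelity to the input image and enables scalable, high-quality 3D synthesis from images.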
Contribution
The paper proposes a pixel-aligned 3D generation paradigm with explicit pixel-to-3D correspondence, aiming at higher fidelity and better scalability than existing methods.
Findings
Pixal3D achieves high-fidelity 3D asset generation from images.
The method improves pixel-level faithfulness compared to prior approaches.
Pixal3D extends naturally to multi-view scene synthesis.
Abstract
Recent advances in 3D generative models have rapidly improved image-to-3D synthesis quality, enabling higher-resolution geometry and more realistic appearance. Yet fidelity, which measures pixel-level faithfulness of the generated 3D asset to the input image, remains a central bottleneck. We argue this stems from an implicit 2D-3D correspondence issue: most 3D-native generators synthesize shape in canonical space and inject image cues via attention, leaving pixel-to-3D associations ambiguous. To tackle this issue, we draw inspiration from 3D reconstruction and propose Pixal3D, a pixel-aligned 3D generation paradigm for high-fidelity 3D asset creation from images. Instead of generating in a canonical pose, Pixal3D directly generates 3D in a pixel-aligned way, consistent with the input view. To enable this, we introduce a pixel back-projection conditioning scheme that explicitly…
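The abstract is cut off before it describes the conditioning scheme, so the sketch below is background only. It shows the standard pinhole back-projection used in 3D reconstruction, which lifts a pixel with a known depth to a 3D point in the camera frame. It is not Pixal3D's conditioning scheme; the function names and the intrinsics (fx, fy, cx, cy) are assumptions chosen for illustration.

```python
from typing import Tuple

Point3 = Tuple[float, float, float]


def back_project(u: float, v: float, depth: float,
                 fx: float, fy: float, cx: float, cy: float) -> Point3:
    """Lift pixel (u, v) at the given depth to a camera-frame 3D point.

    Assumes an ideal pinhole camera without lens distortion
    (hypothetical helper, not taken from the paper).
    """
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return (x, y, depth)


def project(point: Point3,
            fx: float, fy: float, cx: float, cy: float) -> Tuple[float, float]:
    """Project a camera-frame 3D point back to pixel coordinates."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    return (fx * x / z + cx, fy * y / z + cy)


# Illustrative intrinsics for a 640x480 image (assumed values).
FX, FY, CX, CY = 500.0, 500.0, 320.0, 240.0

point = back_project(420.0, 140.0, 2.0, FX, FY, CX, CY)
pixel = project(point, FX, FY, CX, CY)
```

Projecting the lifted point returns the original pixel, which is the sense in which each 3D location is tied to a specific image location in a pixel-aligned frame.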
