Cue3D: Quantifying the Role of Image Cues in Single-Image 3D Generation
Xiang Li, Zirui Wang, Zixuan Huang, James M. Rehg

TL;DR
Cue3D introduces a comprehensive framework to quantify how different image cues like shading, texture, and silhouette influence single-image 3D generation, revealing the importance of shape and geometric cues for model performance.
Contribution
It provides the first model-agnostic benchmark to systematically evaluate the influence of individual image cues on 3D generation methods.
Findings
Shape meaningfulness is key for generalization.
Shading is a crucial geometric cue for 3D generation.
Models over-rely on silhouettes and show varied sensitivities to cues.
Abstract
Humans and traditional computer vision methods rely on a diverse set of monocular cues to infer 3D structure from a single image, such as shading, texture, silhouette, etc. While recent deep generative models have dramatically advanced single-image 3D generation, it remains unclear which image cues these methods actually exploit. We introduce Cue3D, the first comprehensive, model-agnostic framework for quantifying the influence of individual image cues in single-image 3D generation. Our unified benchmark evaluates seven state-of-the-art methods, spanning regression-based, multi-view, and native 3D generative paradigms. By systematically perturbing cues such as shading, texture, silhouette, perspective, edges, and local continuity, we measure their impact on 3D output quality. Our analysis reveals that shape meaningfulness, not texture, dictates generalization. Geometric cues,…
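The perturb-and-measure idea described in the abstract can be illustrated with a minimal sketch. This is not the Cue3D implementation: `shuffle_patches` and `relative_drop` are hypothetical names, tile shuffling is only one plausible way to disrupt local continuity, and the quality score is a placeholder for whatever higher-is-better 3D metric an evaluation might use.

```python
import numpy as np


def shuffle_patches(image, patch, seed=0):
    """Randomly permute non-overlapping patch x patch tiles of an H x W x C image.

    Assumption: tile shuffling stands in for a "local continuity" perturbation;
    the paper's actual perturbation may differ.
    """
    h, w, c = image.shape
    if h % patch or w % patch:
        raise ValueError("image height and width must be multiples of patch")
    gh, gw = h // patch, w // patch
    # Split into a (gh * gw) stack of tiles, each patch x patch x C.
    tiles = image.reshape(gh, patch, gw, patch, c).swapaxes(1, 2)
    tiles = tiles.reshape(gh * gw, patch, patch, c)
    # Reorder the tiles with a seeded permutation for reproducibility.
    rng = np.random.default_rng(seed)
    tiles = tiles[rng.permutation(gh * gw)]
    # Reassemble the tiles into an image of the original shape.
    tiles = tiles.reshape(gh, gw, patch, patch, c).swapaxes(1, 2)
    return tiles.reshape(h, w, c)


def relative_drop(clean_score, perturbed_score):
    """Fractional loss in a higher-is-better quality score (hypothetical metric)."""
    return (clean_score - perturbed_score) / clean_score
```

In use, one would run the same 3D generator on the clean and perturbed images, score both outputs against ground truth, and compare `relative_drop` across cues and methods; a larger drop suggests heavier reliance on the perturbed cue.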
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · 3D Shape Modeling and Analysis
