TL;DR
This paper reveals that state-of-the-art text-to-3D models often become insensitive to prompts due to latent sink traps, and proposes a method leveraging unconditional priors for robust out-of-distribution shape editing.
Contribution
It introduces a novel framework that decouples geometric expressivity from linguistic sensitivity, enabling reliable out-of-distribution 3D shape manipulation.
Findings
The models can still generate a wide diversity of shapes even when they are insensitive to prompt changes.
Leveraging unconditional priors allows for high-fidelity shape editing.
The approach bypasses latent sink traps for robust out-of-distribution editing.
Abstract
Text-driven inversion of generative models is a core paradigm for manipulating 2D or 3D content, unlocking numerous applications such as text-based editing, style transfer, or inverse problems. However, it relies on the assumption that generative models remain sensitive to natural language prompts. We demonstrate that for state-of-the-art native text-to-3D generative models, this assumption often collapses. We identify a critical failure mode where generation trajectories are drawn into latent "sink traps": regions where the model becomes insensitive to prompt modifications. In these regimes, changes to the input text fail to alter internal representations in a way that alters the output geometry. Crucially, we observe that this is not a limitation of the model's geometric expressivity; the same generative models possess the ability to produce a vast diversity of shapes but,…
