Debiasing Diffusion Priors via 3D Attention for Consistent Gaussian Splatting
Shilong Jin, Haoran Duan, Litao Hua, Wentao Huang, Yuan Zhou

TL;DR
This paper introduces TD-Attn, a framework that improves multi-view consistency in 3D tasks distilled from Text-to-Image (T2I) diffusion models. It addresses prior view bias through 3D-aware attention guidance and hierarchical modulation.
Contribution
The paper presents a new framework, TD-Attn, with two modules that mitigate prior view bias and enhance multi-view consistency in 3D generation and editing tasks.
Findings
Significantly improves multi-view consistency in 3D tasks
Enables controllable and precise 3D editing
Serves as a universal plugin for 3D applications
Abstract
Versatile 3D tasks (e.g., generation or editing) that distill from Text-to-Image (T2I) diffusion models have attracted significant research interest for not relying on extensive 3D training data. However, T2I models exhibit limitations resulting from prior view bias, which produces conflicting appearances between different views of an object. This bias causes subject-words to preferentially activate prior view features during cross-attention (CA) computation, regardless of the target view condition. To overcome this limitation, we conduct a comprehensive mathematical analysis to reveal the root cause of the prior view bias in T2I models. Moreover, we find different UNet layers show different effects of prior view in CA. Therefore, we propose a novel framework, TD-Attn, which addresses multi-view inconsistency via two key components: (1) the 3D-Aware Attention Guidance Module (3D-AAG)…
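The abstract locates the bias in cross-attention (CA), where each image location attends over the prompt tokens. The following is a minimal sketch of standard scaled dot-product cross-attention, showing how the attention map of a single subject word can be read out. It is not the TD-Attn method; the function names, the two-dimensional toy vectors, and the token roles are assumptions made purely for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_maps(queries, keys):
    """Return weights[i][j]: how strongly image location i attends to text token j.

    queries: one feature vector per image location (hypothetical toy values).
    keys:    one feature vector per prompt token (hypothetical toy values).
    """
    scale = 1.0 / math.sqrt(len(keys[0]))
    maps = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        maps.append(softmax(scores))  # normalize over the text tokens
    return maps


def token_map(maps, token_index):
    """Extract the spatial attention map of one prompt token."""
    return [row[token_index] for row in maps]


# Toy setup (assumed): token 0 stands in for a subject word,
# token 1 for a view word such as "back".
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
keys = [[2.0, 0.0], [0.0, 2.0]]

maps = cross_attention_maps(queries, keys)
subject_map = token_map(maps, 0)
```

In this toy setup, `subject_map` holds one weight per image location. In the abstract's terms, prior view bias would correspond to the subject word's map favoring prior view features no matter which view the prompt requests.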
Taxonomy
Topics: 3D Shape Modeling and Analysis · Multimodal Machine Learning Applications · Visual Attention and Saliency Detection
