SeeThrough3D: Occlusion Aware 3D Control in Text-to-Image Generation

Vaibhav Agrawal; Rishubh Parihar; Pradhaan Bhat; Ravi Kiran Sarvadevabhatla; R. Venkatesh Babu

arXiv:2602.23359·cs.CV·February 27, 2026

Open Access · 1 Model · 1 Dataset

TL;DR

SeeThrough3D introduces an occlusion-aware 3D scene representation for text-to-image generation, enabling realistic occlusions and precise camera control in multi-object scenes.

Contribution

The paper presents SeeThrough3D, a model that explicitly represents occlusions through a translucent 3D scene representation and uses that representation to condition a pretrained text-to-image model.

Findings

01. Effective modeling of occlusions in 3D scene generation

02. Generalizes to unseen object categories

03. Enables precise 3D layout control with realistic occlusions

Abstract

We identify occlusion reasoning as a fundamental yet overlooked aspect of 3D layout-conditioned generation. It is essential for synthesizing partially occluded objects with depth-consistent geometry and scale. While existing methods can generate realistic scenes that follow input layouts, they often fail to model precise inter-object occlusions. We propose SeeThrough3D, a model for 3D layout-conditioned generation that explicitly models occlusions. We introduce an occlusion-aware 3D scene representation (OSCR), where objects are depicted as translucent 3D boxes placed within a virtual environment and rendered from the desired camera viewpoint. The transparency encodes hidden object regions, enabling the model to reason about occlusions, while the rendered viewpoint provides explicit camera control during generation. We condition a pretrained flow-based text-to-image generation model…
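
To illustrate why translucency can encode hidden regions, here is a minimal sketch of standard back-to-front alpha compositing over a single image row. It is not the authors' rendering pipeline, which this page does not describe. The `Box` fields, the fixed `alpha` of 0.5, and the pre-projected `x_range` are all hypothetical simplifications chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical stand-in for one translucent object box, already projected to the image."""
    name: str
    color: tuple      # RGB in [0, 1]
    depth: float      # distance from the camera; larger means farther away
    x_range: tuple    # (start, end) pixel columns covered by the box, end exclusive
    alpha: float = 0.5  # assumed opacity; 1.0 would hide everything behind the box


def composite_pixel(boxes, x, background=(1.0, 1.0, 1.0)):
    """Blend every box covering column x, far to near, with the 'over' operator."""
    color = background
    covering = [b for b in boxes if b.x_range[0] <= x < b.x_range[1]]
    for b in sorted(covering, key=lambda b: -b.depth):
        color = tuple(b.alpha * c + (1 - b.alpha) * bg
                      for c, bg in zip(b.color, color))
    return color


def render_row(boxes, width):
    """Render one image row as a list of RGB tuples."""
    return [composite_pixel(boxes, x) for x in range(width)]


if __name__ == "__main__":
    near = Box("near", (1.0, 0.0, 0.0), depth=2.0, x_range=(0, 6))
    far = Box("far", (0.0, 0.0, 1.0), depth=5.0, x_range=(4, 10))
    for x, rgb in enumerate(render_row([near, far], 12)):
        print(x, rgb)
```

In the overlap (columns 4 and 5) the blended color differs from both the near-only and far-only regions, so the extent of the far box remains recoverable from the rendered image even where the near box covers it. With opaque boxes (`alpha=1.0`) that information would be lost.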

Code & Models

Models

va1bhavagrawa1/seethrough3d
model · ♡ 2

Datasets

va1bhavagrawa1/seethrough3d-data
dataset · 104k downloads

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis · Computer Graphics and Visualization Techniques