Joint Shadow Generation and Relighting via Light-Geometry Interaction Maps
Shan Wang, Peixia Li, Chenchen Xu, Ziang Cheng, Jiayu Yang, Hongdong Li, Pulak Purkait

TL;DR
This paper introduces Light-Geometry Interaction maps, a new light-aware occlusion representation from monocular depth, enabling joint shadow generation and relighting with improved realism and physical consistency.
Contribution
It presents LGI maps for light-shadow interaction encoding, a unified pipeline for shadow and relighting, and a large-scale benchmark dataset for training and evaluation.
Findings
Enhanced realism and consistency in shadow rendering
Effective joint shadow generation and relighting
Significant improvements over prior disjoint methods
Abstract
We propose Light-Geometry Interaction (LGI) maps, a novel representation that encodes light-aware occlusion from monocular depth. Unlike ray tracing, which requires full 3D reconstruction, LGI captures essential light-shadow interactions reliably and accurately and is computed from off-the-shelf 2.5D depth map predictions. LGI explicitly ties illumination direction to geometry, providing a physics-inspired prior that constrains generative models. Without such a prior, these models often produce floating shadows, inconsistent illumination, and implausible shadow geometry. Building on this representation, we propose a unified pipeline for joint shadow generation and relighting (unlike prior methods, which treat them as disjoint tasks), capturing the intrinsic coupling of illumination and shadowing essential for modeling indirect effects. By embedding LGI into a bridge-matching generative backbone,…
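The portion of the abstract shown here does not give the LGI formula, so the following is only a rough illustration of the general idea it describes: reading light-dependent occlusion directly from a 2.5D depth map instead of a full 3D scene. It is a minimal sketch of a classic heightfield shadow test, not the authors' LGI construction. The function name `heightfield_shadow_mask`, the orthographic-view assumption, and the choice to treat the depth map as a height field (larger value means taller) are all assumptions made for illustration.

```python
import numpy as np


def heightfield_shadow_mask(height, light_dir_xy, light_elevation_deg,
                            max_steps=64, step=1.0):
    """Mark pixels whose straight path to the light is blocked.

    ASSUMPTION (illustrative, not from the paper): `height` is an (H, W)
    array in pixel units seen orthographically; in practice it might be
    derived from a monocular depth prediction.
    `light_dir_xy` is the (dx, dy) image-plane direction from a pixel
    toward the light; `light_elevation_deg` is the light's angle above
    the image plane.
    """
    h, w = height.shape
    d = np.asarray(light_dir_xy, dtype=float)
    d = d / np.linalg.norm(d)
    tan_light = np.tan(np.deg2rad(light_elevation_deg))

    ys, xs = np.mgrid[0:h, 0:w]
    shadow = np.zeros((h, w), dtype=bool)

    for k in range(1, max_steps + 1):
        dist = k * step
        # Sample position reached after marching `dist` toward the light.
        sx = np.rint(xs + d[0] * dist).astype(int)
        sy = np.rint(ys + d[1] * dist).astype(int)
        inside = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
        sxc = np.clip(sx, 0, w - 1)
        syc = np.clip(sy, 0, h - 1)
        # Height of the light ray above the image plane at that sample.
        ray_height = height + dist * tan_light
        # The pixel is occluded if the surface there rises above the ray.
        shadow |= inside & (height[syc, sxc] > ray_height)

    return shadow


# Usage: a 4.5-unit-tall wall on flat ground, lit from the right at 45 degrees.
scene = np.zeros((8, 20))
scene[:, 10:13] = 4.5
mask = heightfield_shadow_mask(scene, light_dir_xy=(1.0, 0.0),
                               light_elevation_deg=45.0)
```

In this sketch, changing `light_dir_xy` or `light_elevation_deg` moves and stretches the shadow, which is the kind of explicit coupling between illumination direction and geometry that the abstract attributes to LGI. Errors in the height field translate directly into misplaced shadows.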
Peer Reviews
Decision · ICLR 2026 Poster
- This paper presents the first generative method for the joint synthesis of shadows and object relighting from a single 2D RGB image. A key innovation is the introduction of a control mechanism conditioned on monocular depth, which enables the generation of continuous and coherent shadow and relighting effects in direct response to continuous variations in illumination.
- This paper introduces ShadRel, a large-scale synthetic dataset developed for the tasks of shadow generation and object relig
The visual results presented in the paper are compelling. However, the experiments primarily showcase objects with relatively simple geometric structures. To more rigorously assess the robustness and generalization capabilities of the proposed method, I would encourage the authors to include results on more structurally complex objects. For instance, objects with fine-grained details, intricate parts, or significant self-occlusion (e.g., a bicycle, a detailed sculpture, or a potted plant) would
1. LGI makes intuitive use of the depth prior and suits the pipeline's objective of joint shadow generation and object relighting.
2. The proposed pipeline integrates modern large-model backbones, such as depth estimation models and latent bridge matching models.
3. Dataset contribution: ShadRel is a valuable contribution to the community.
1. Dependence on monocular depth information. A significant improvement of the proposed pipeline is the introduction of depth information into the LGI representation, which also raises concerns. Monocular depth estimation is ill-posed and frequently fails to provide accurate estimates, which can introduce misleading priors into the generation pipeline. This requires further clarification. At the same time, although LGI provides better performance compared to previous baselines, the depth informati
1. This paper proposes a novel shadow generation and relighting framework. The LGI map models light-geometry interactions from the depth map, which encodes useful shadow and illumination information, enabling the model to generate reasonable relighting results.
2. Extensive experiments show that the method achieves better performance than the baselines.
1. I have some doubts about the proposed LGI map and whether it can accurately represent shadow and lighting information. Consider the point $p$ in Fig. 3(a): for this point, $e^d_n$ is always negative, which means that the elevation of the light is always higher than the horizon, so this point can see the light source and its $c^m_1$ is negative. But the point $p$ in Fig. 3(c) is in shadow, and its $c^m_1$ is also negative. So why does $c^m_1$ indicate the potential start of
Taxonomy
Topics: Computer Graphics and Visualization Techniques · 3D Shape Modeling and Analysis · Interactive and Immersive Displays
