UniLight: A Unified Representation for Lighting
Zitian Zhang, Iliyan Georgiev, Michael Fischer, Yannick Hold-Geoffroy, Jean-François Lalonde, Valentin Deschaintre

TL;DR
UniLight introduces a unified latent space for lighting that aligns multiple modalities, such as text, images, and environment maps, enabling cross-modal lighting understanding and lighting manipulation in images.
Contribution
The paper proposes UniLight, a novel joint embedding space for diverse lighting representations, facilitating cross-modal transfer and manipulation in visual tasks.
Findings
Effective multi-modal alignment of lighting representations.
Improved performance in lighting-based retrieval and environment map generation.
Enables flexible lighting control in image synthesis.
Abstract
Lighting has a strong influence on visual appearance, yet understanding and representing lighting in images remains notoriously difficult. Various lighting representations exist, such as environment maps, irradiance, spherical harmonics, or text, but they are incompatible, which limits cross-modal transfer. We thus propose UniLight, a joint latent space as lighting representation, that unifies multiple modalities within a shared embedding. Modality-specific encoders for text, images, irradiance, and environment maps are trained contrastively to align their representations, with an auxiliary spherical-harmonics prediction task reinforcing directional understanding. Our multi-modal data pipeline enables large-scale training and evaluation across three tasks: lighting-based retrieval, environment-map generation, and lighting control in diffusion-based image synthesis. Experiments show that…
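The contrastive alignment described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a standard symmetric InfoNCE-style loss between two batches of paired embeddings (for example, image and environment-map embeddings of the same lighting), plus a mean-squared-error term standing in for the auxiliary spherical-harmonics prediction task. The function names, the temperature of 0.07, and the auxiliary weight `sh_weight` are all hypothetical choices made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_contrastive_loss(emb_a, emb_b, temperature=0.07):
    """InfoNCE-style loss: row i of emb_a should match row i of emb_b.

    Assumption: a generic CLIP-like objective, not necessarily the loss
    used in the paper.
    """
    a = l2_normalize(emb_a)
    b = l2_normalize(emb_b)
    logits = a @ b.T / temperature          # (batch, batch) similarity matrix
    idx = np.arange(len(a))                 # matching pairs sit on the diagonal
    loss_ab = -log_softmax(logits, axis=1)[idx, idx].mean()  # a -> b direction
    loss_ba = -log_softmax(logits, axis=0)[idx, idx].mean()  # b -> a direction
    return 0.5 * (loss_ab + loss_ba)


def total_loss(emb_a, emb_b, sh_pred, sh_target, sh_weight=1.0):
    """Contrastive loss plus a hypothetical auxiliary SH regression term."""
    sh_loss = np.mean((sh_pred - sh_target) ** 2)
    return symmetric_contrastive_loss(emb_a, emb_b) + sh_weight * sh_loss
```

In this sketch, paired embeddings that agree yield a lower loss than the same embeddings with their pairing shuffled, which is the property that pulls the modality-specific encoders toward a shared space.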
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · Multimodal Machine Learning Applications
