GenLit: Reformulating Single-Image Relighting as Video Generation

Shrisha Bharadwaj; Haiwen Feng; Giorgio Becherini; Victoria Fernandez Abrevaya; Michael J. Black

arXiv:2412.11224·cs.CV·October 24, 2025

TL;DR

GenLit leverages a video diffusion model to relight single images by manipulating virtual light sources, bypassing traditional 3D reconstruction and rendering, and producing realistic relighting effects in video sequences.

Contribution

This work introduces GenLit, a novel framework that uses a video diffusion model to perform single-image relighting by directly manipulating virtual light sources, avoiding explicit 3D asset reconstruction.

Findings

01. Model fine-tuned on synthetic data generalizes to real scenes.

02. Produces plausible shadows and inter-reflections.

03. Enables relighting without explicit 3D reconstruction or ray-tracing.

Abstract

Manipulating the illumination of a 3D scene within a single image represents a fundamental challenge in computer vision and graphics. This problem has traditionally been addressed using inverse rendering techniques, which involve explicit 3D asset reconstruction and costly ray-tracing simulations. Meanwhile, recent advancements in visual foundation models suggest that a new paradigm could soon be possible -- one that replaces explicit physical models with networks that are trained on large amounts of image and video data. In this paper, we exploit the implicit scene understanding of a video diffusion model, particularly Stable Video Diffusion, to relight a single image. We introduce GenLit, a framework that distills the ability of a graphics engine to perform light manipulation into a video-generation model, enabling users to directly insert and manipulate a point light in the 3D world…

Code & Models

Datasets

gbmpi/genlit_dataset
dataset · 13 dl

Taxonomy

Topics: Computer Graphics and Visualization Techniques · Advanced Vision and Imaging

Methods: Diffusion