ReLi3D: Relightable Multi-view 3D Reconstruction with Disentangled Illumination

Jan-Niklas Dihlmann; Mark Boss; Simon Donne; Andreas Engelhardt; Hendrik P.A. Lensch; Varun Jampani

arXiv:2603.19753 · cs.CV · March 23, 2026

TL;DR

ReLi3D introduces a fast, unified end-to-end pipeline that reconstructs detailed 3D geometry, materials, and illumination from sparse multi-view images, enabling near-instantaneous relightable 3D asset creation.

Contribution

ReLi3D is presented as the first method to unify geometry, material, and illumination reconstruction in a single end-to-end pipeline, using multi-view constraints and a transformer cross-conditioning architecture.

Findings

01 Reconstructs a relightable 3D asset in under one second.

02 Demonstrates high accuracy in geometry, materials, and illumination.

03 Generalizes well across synthetic and real-world data.

Abstract

Reconstructing 3D assets from images has long required separate pipelines for geometry reconstruction, material estimation, and illumination recovery, each with distinct limitations and computational overhead. We present ReLi3D, the first unified end-to-end pipeline that simultaneously reconstructs complete 3D geometry, spatially-varying physically-based materials, and environment illumination from sparse multi-view images in under one second. Our key insight is that multi-view constraints can dramatically improve material and illumination disentanglement, a problem that remains fundamentally ill-posed for single-image methods. Key to our approach is the fusion of the multi-view input via a transformer cross-conditioning architecture, followed by a novel unified two-path prediction strategy. The first path predicts the object's structure and appearance, while the second path predicts…

Peer Reviews

Decision · ICLR 2026 Poster

Reviewer 01 · Rating 6 · Confidence 4

Strengths

1. The two-path feed-forward framework jointly reconstructs geometry, materials, and illumination under multi-view constraints, showing a clear and coherent design.
2. Uses Monte Carlo integration with MIS for training supervision, leading to more consistent and realistic reconstructions.
3. The model runs efficiently and shows some degree of cross-domain generalization with mixed synthetic–real training.
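The Monte Carlo integration with MIS (multiple importance sampling) mentioned in strength 2 can be illustrated with a toy example. The sketch below is not the paper's training code: it assumes a 1D integrand on [0, 1], two hypothetical sampling strategies (a uniform density and a linear density p(x) = 2x), and the balance heuristic for weighting. In a renderer the two strategies would typically be light sampling and BSDF sampling; the function name `mis_estimate` is made up for illustration.

```python
import random

def mis_estimate(f, n, seed=0):
    """Estimate the integral of f over [0, 1] with two-strategy MIS (balance heuristic)."""
    rng = random.Random(seed)

    # Hypothetical sampling densities on [0, 1] (assumptions for this sketch).
    def p_uniform(x):
        return 1.0

    def p_linear(x):
        return 2.0 * x

    total = 0.0
    for _ in range(n):
        # Draw one sample from each strategy.
        x_u = rng.random()                   # uniform on [0, 1)
        x_l = (1.0 - rng.random()) ** 0.5    # inverse CDF of p(x) = 2x, in (0, 1]
        for x, p_own in ((x_u, p_uniform), (x_l, p_linear)):
            # Balance heuristic with equal sample counts per strategy:
            # w_i(x) = p_i(x) / (p_1(x) + p_2(x)).
            w = p_own(x) / (p_uniform(x) + p_linear(x))
            total += w * f(x) / p_own(x)
    return total / n

# Example: the integral of x^2 over [0, 1] is 1/3.
estimate = mis_estimate(lambda x: x * x, 20000)
```

Because each sample's contribution is weighted by how well its own strategy covers that point, neither strategy's poorly sampled regions dominate the variance, which is the usual motivation for MIS in rendering losses.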

Weaknesses

1. Limited evaluation diversity. The test data mostly covers diffuse or moderately lit objects. The paper lacks challenging cases such as metallic, transparent materials, or strong HDR illumination, where disentanglement performance would be most critical.
2. Lack of illumination disentanglement evaluation. The paper does not provide quantitative evaluation of the predicted lighting quality (e.g., comparison against SPAR3D or DiffusionLight) or at least sufficient qualitative examples demonstra

Reviewer 02 · Rating 4 · Confidence 3

Strengths

1. The writing is clear and easy to follow.
2. The proposed pipeline is new and makes a meaningful contribution to the field.

Weaknesses

Please see the weaknesses for details.

Reviewer 03 · Rating 8 · Confidence 4

Strengths

To my knowledge, the suggested approach is the first that jointly reconstructs mesh, PBR, **and** environment (HDR), all in a feedforward manner and at impressive speeds. To me, this constitutes a significant contribution. Additionally, the paper contains novel ideas (more on that below) and is clear and well-written.
1. The idea of fusing an arbitrary number of views with one "hero" view and other views with latent mixing (what the authors call "cross-view feature fusion") is novel and insightfu

Weaknesses

1. The most important ablation is missing: whether to use multiple paths (one for geometry and appearance, another for illumination) or to compute everything in a single path.
2. While the method was trained and tested on either real-world or synthetic data, it would be valuable to see how it generalizes to generated images (e.g. from diffusion or flow models). This might improve the practicality of the approach.
3. While 3D+Image metrics (Table 2) look convincing at first, qualitative results in

Code & Models

Models

StabilityLabs/ReLi3D (Hugging Face model · 46 downloads · 2 likes)

Taxonomy

Topics: Computer Graphics and Visualization Techniques · Advanced Vision and Imaging · 3D Shape Modeling and Analysis