MV-CoLight: Efficient Object Compositing with Consistent Lighting and Shadow Generation
Kerui Ren, Jiayang Bai, Linning Xu, Lihan Jiang, Jiangmiao Pang, Mulin Yu, Bo Dai

TL;DR
MV-CoLight is a two-stage framework that enables efficient, consistent object compositing across 2D images and 3D scenes by modeling lighting and shadows directly, outperforming existing methods in realism and scalability.
Contribution
It introduces a novel feed-forward architecture with Hilbert curve-based mapping and a large-scale dataset for illumination-consistent object compositing.
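The summary does not spell out how the Hilbert curve-based mapping is constructed, so the following is only a minimal sketch of the textbook Hilbert index, not the authors' implementation. It assumes a square grid whose side `n` is a power of two, and the function names `hilbert_index` and `hilbert_order` are hypothetical. It shows the property that makes such a curve attractive for aligning 2D pixels with a 1D sequence: cells that are consecutive along the curve are always neighbors in the grid.

```python
def hilbert_index(n, x, y):
    """Return the position of cell (x, y) along a Hilbert curve
    covering an n x n grid (n is assumed to be a power of two)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        # Each quadrant contributes s*s cells; (3*rx) ^ ry is its visiting order.
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/reflect so the sub-quadrant is traversed in canonical orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def hilbert_order(n):
    """List all (x, y) cells of an n x n grid in Hilbert-curve order."""
    cells = [(x, y) for x in range(n) for y in range(n)]
    return sorted(cells, key=lambda c: hilbert_index(n, c[0], c[1]))


if __name__ == "__main__":
    # Traversal order of the smallest non-trivial grid.
    print(hilbert_order(2))
```

Compared with a plain row-major scan, which jumps across the whole image at the end of every row, this ordering keeps spatially close pixels close in the flattened sequence.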
Findings
Achieves state-of-the-art results on standard benchmarks.
Demonstrates robustness on real-world scenes.
Provides a scalable and efficient compositing framework.
Abstract
Object compositing offers significant promise for augmented reality (AR) and embodied intelligence applications. Existing approaches predominantly focus on single-image scenarios or intrinsic decomposition techniques, facing challenges with multi-view consistency, complex scenes, and diverse lighting conditions. Recent inverse rendering advancements, such as 3D Gaussian and diffusion-based methods, have enhanced consistency but are limited by scalability, heavy data requirements, or prolonged reconstruction time per scene. To broaden its applicability, we introduce MV-CoLight, a two-stage framework for illumination-consistent object compositing in both 2D images and 3D scenes. Our novel feed-forward architecture models lighting and shadows directly, avoiding the iterative biases of diffusion-based methods. We employ a Hilbert curve-based mapping to align 2D image inputs with 3D Gaussian…
Taxonomy
Topics: Computer Graphics and Visualization Techniques
