DeRenderNet: Intrinsic Image Decomposition of Urban Scenes with Shape-(In)dependent Shading Rendering
Yongjie Zhu, Jiajun Tang, Si Li, and Boxin Shi

TL;DR
DeRenderNet is a deep neural network, trained in a self-supervised manner on videogame data, that decomposes a single outdoor urban scene image into albedo and latent lighting, renders shape-dependent and shape-independent shading, predicts shadows accurately, and improves accuracy on high-level vision tasks.
Contribution
It introduces a self-supervised approach to intrinsic image decomposition of urban scenes that uses videogame data for supervision and separates shading into shape-dependent and shape-independent components.
Findings
Produces shadow-free albedo maps with detailed textures
Accurately predicts shape-independent shading and shadows
Enhances re-rendering and high-level vision task accuracy
Abstract
We propose DeRenderNet, a deep neural network to decompose the albedo and latent lighting, and render shape-(in)dependent shadings, given a single image of an outdoor urban scene, trained in a self-supervised manner. To achieve this goal, we propose to use the albedo maps extracted from scenes in videogames as direct supervision and pre-compute the normal and shadow prior maps based on the depth maps provided as indirect supervision. Compared with state-of-the-art intrinsic image decomposition methods, DeRenderNet produces shadow-free albedo maps with clean details and an accurate prediction of shadows in the shape-independent shading, which is shown to be effective in re-rendering and improving the accuracy of high-level vision tasks for urban scenes.
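The abstract mentions pre-computing normal maps from depth maps and re-rendering from the decomposed components. The following is a minimal sketch of both ideas, not the authors' implementation. It assumes an orthographic camera with unit pixel spacing for the normals, and a purely multiplicative image model (albedo times shape-dependent shading times shape-independent shading) for the re-rendering; the function names `normals_from_depth` and `recompose` are hypothetical.

```python
import numpy as np


def normals_from_depth(depth):
    """Estimate per-pixel unit normals from a depth map by finite differences.

    Assumption: orthographic camera, unit pixel spacing. This is an
    illustrative simplification, not the procedure stated in the paper.
    """
    depth = np.asarray(depth, dtype=np.float64)
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (columns, x).
    dz_dy, dz_dx = np.gradient(depth)
    # For a surface z = f(x, y), an unnormalized normal is (-dz/dx, -dz/dy, 1).
    n = np.stack([-dz_dx, -dz_dy, np.ones_like(depth)], axis=-1)
    n /= np.linalg.norm(n, axis=-1, keepdims=True)
    return n


def recompose(albedo, shading_dep, shading_indep):
    """Re-render an image as a per-pixel product (assumed image model).

    albedo:        (H, W, 3) reflectance
    shading_dep:   (H, W) shape-dependent shading
    shading_indep: (H, W) shape-independent shading, e.g. cast shadows
    """
    return albedo * shading_dep[..., None] * shading_indep[..., None]


if __name__ == "__main__":
    # A planar ramp: depth increases by 1 per pixel along x.
    depth = np.tile(np.arange(4, dtype=np.float64), (4, 1))
    print(normals_from_depth(depth)[0, 0])

    albedo = np.full((4, 4, 3), 0.8)
    shadow = np.ones((4, 4))
    shadow[:, 2:] = 0.5  # right half lies in shadow
    image = recompose(albedo, np.ones((4, 4)), shadow)
    print(image[0, 0], image[0, 3])
```

Under these assumptions, a shadow-free albedo map stays constant across the shadow boundary, while the shape-independent shading term carries the darkening.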
Taxonomy
Topics: Advanced Vision and Imaging · Computer Graphics and Visualization Techniques · Remote Sensing and LiDAR Applications
