RGB-D-Fusion: Image Conditioned Depth Diffusion of Humanoid Subjects

Sascha Kirch (1), Valeria Olyunina (2), Jan Ondřej (2), Rafael Pagés (2), Sergio Martin (1), Clara Pérez-Molina (1) ((1) UNED - Universidad Nacional de Educación a Distancia, Madrid, Spain; (2) Volograms Ltd, Dublin, Ireland)

arXiv:2307.15988 · cs.CV · September 25, 2023

TL;DR

RGB-D-Fusion is a multi-stage diffusion approach that generates high-resolution depth maps from low-resolution monocular RGB images of humanoid subjects. It also introduces depth noise augmentation to make the super-resolution stage more robust.

Contribution

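It proposes a two-stage diffusion pipeline: an RGB-conditioned model first generates a low-resolution depth map, and a second model conditioned on the low-resolution RGB-D image then upsamples it. A new depth noise augmentation increases the robustness of the super-resolution model.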

Findings

01. Achieves high-quality depth maps from low-res RGB inputs.
02. Demonstrates improved robustness with depth noise augmentation.
03. Outperforms existing depth super-resolution methods.

Abstract

We present RGB-D-Fusion, a multi-modal conditional denoising diffusion probabilistic model to generate high-resolution depth maps from low-resolution monocular RGB images of humanoid subjects. RGB-D-Fusion first generates a low-resolution depth map using an image-conditioned denoising diffusion probabilistic model and then upsamples the depth map using a second denoising diffusion probabilistic model conditioned on a low-resolution RGB-D image. We further introduce a novel augmentation technique, depth noise augmentation, to increase the robustness of our super-resolution model.
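The two-stage flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the linear noise schedule, the nearest-neighbour upsampling of the condition, the upsampling factor, and the Gaussian form of the depth noise augmentation are all assumptions made for illustration, and `dummy_eps_model` is a hypothetical stand-in for a trained noise-prediction network.

```python
import numpy as np

def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule (an assumption; the paper's schedule may differ)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars

def dummy_eps_model(x_t, condition, t):
    """Hypothetical placeholder for a trained network that predicts noise."""
    return np.zeros_like(x_t)

def ddpm_sample(eps_model, condition, shape, rng, num_steps=50):
    """Standard DDPM ancestral sampling of a depth map given a condition."""
    betas, alphas, alpha_bars = make_schedule(num_steps)
    x = rng.standard_normal(shape)  # start from pure Gaussian noise
    for t in reversed(range(num_steps)):
        eps = eps_model(x, condition, t)
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        noise = rng.standard_normal(shape) if t > 0 else np.zeros(shape)
        x = mean + np.sqrt(betas[t]) * noise
    return x

def upsample_nearest(img, factor):
    """Nearest-neighbour upsampling along height and width."""
    return img.repeat(factor, axis=0).repeat(factor, axis=1)

def depth_noise_augmentation(depth_lr, rng, sigma=0.05):
    """Assumed form of the augmentation: perturb the low-resolution depth
    condition with Gaussian noise during training of the second stage."""
    return depth_lr + sigma * rng.standard_normal(depth_lr.shape)

def rgbd_fusion_sketch(rgb_lr, factor=4, seed=0):
    rng = np.random.default_rng(seed)
    h, w, _ = rgb_lr.shape
    # Stage 1: low-resolution depth conditioned on the low-resolution RGB image.
    depth_lr = ddpm_sample(dummy_eps_model, rgb_lr, (h, w, 1), rng)
    # Stage 2: high-resolution depth conditioned on the low-resolution RGB-D image.
    rgbd_lr = np.concatenate([rgb_lr, depth_lr], axis=-1)
    cond = upsample_nearest(rgbd_lr, factor)  # how the condition is fed is assumed
    depth_hr = ddpm_sample(dummy_eps_model, cond, (h * factor, w * factor, 1), rng)
    return depth_lr, depth_hr
```

Calling `rgbd_fusion_sketch(np.zeros((8, 8, 3)))` returns a low-resolution and a high-resolution depth array. With the placeholder network the values are meaningless, so the sketch only shows how the conditioning passes from the first stage to the second.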

Code & Models

Repositories

sascha-kirch/rgb-d-fusion (TensorFlow, official)

Taxonomy

Topics: Image Processing Techniques and Applications · Cell Image Analysis Techniques · Advanced Vision and Imaging

Methods: Diffusion