RGB-D-Fusion: Image Conditioned Depth Diffusion of Humanoid Subjects
Sascha Kirch (1), Valeria Olyunina (2), Jan Ondřej (2), Rafael Pagés (2), Sergio Martin (1), Clara Pérez-Molina (1) ((1) UNED - Universidad Nacional de Educación a Distancia, Madrid, Spain, (2) Volograms ltd, Dublin, Ireland)

TL;DR
RGB-D-Fusion is a multi-stage diffusion-based approach that generates high-resolution depth maps from low-resolution monocular RGB images of humanoid subjects, and uses a new depth noise augmentation to make its super-resolution stage more robust.
Contribution
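It proposes a two-stage diffusion pipeline: an image-conditioned model first generates a low-resolution depth map, and a second model, conditioned on the low-resolution RGB-D image, upsamples it. A new depth noise augmentation increases the robustness of the super-resolution model.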
Findings
Achieves high-quality depth maps from low-res RGB inputs.
Demonstrates improved robustness with depth noise augmentation.
Outperforms existing depth super-resolution methods.
Abstract
We present RGB-D-Fusion, a multi-modal conditional denoising diffusion probabilistic model to generate high-resolution depth maps from low-resolution monocular RGB images of humanoid subjects. RGB-D-Fusion first generates a low-resolution depth map using an image-conditioned denoising diffusion probabilistic model and then upsamples the depth map using a second denoising diffusion probabilistic model conditioned on a low-resolution RGB-D image. We further introduce a novel augmentation technique, depth noise augmentation, to increase the robustness of our super-resolution model.
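The data flow into the second (super-resolution) stage can be illustrated with a minimal sketch. The abstract does not say how depth noise augmentation is implemented, so the version below is an assumption: it perturbs the low-resolution depth map with zero-mean Gaussian noise of randomly chosen strength before stacking it with the RGB image into the RGB-D condition. The function names, the `max_sigma` parameter, and the 64x64 resolution are hypothetical, and the diffusion models themselves are omitted.

```python
import numpy as np


def depth_noise_augmentation(depth_lr, max_sigma=0.05, rng=None):
    """Hypothetical augmentation: add Gaussian noise of random strength.

    Assumption, not the paper's stated method: the noise standard
    deviation is drawn uniformly from [0, max_sigma] per sample.
    """
    rng = np.random.default_rng() if rng is None else rng
    sigma = rng.uniform(0.0, max_sigma)
    noisy = depth_lr + rng.normal(0.0, sigma, size=depth_lr.shape)
    return noisy.astype(depth_lr.dtype), sigma


def make_sr_condition(rgb_lr, depth_lr):
    """Stack RGB (H, W, 3) and depth (H, W) into an RGB-D array (H, W, 4)."""
    return np.concatenate([rgb_lr, depth_lr[..., None]], axis=-1)


# Stand-in inputs; in the real pipeline depth_lr would come from stage one.
rng = np.random.default_rng(0)
rgb_lr = rng.random((64, 64, 3), dtype=np.float32)
depth_lr = rng.random((64, 64), dtype=np.float32)

noisy_depth, sigma = depth_noise_augmentation(depth_lr, rng=rng)
cond = make_sr_condition(rgb_lr, noisy_depth)
print(cond.shape, cond.dtype)
```

The intuition behind training on a perturbed condition is that the super-resolution model then sees imperfect depth during training, much like the imperfect depth maps the first stage would produce at inference time.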
Taxonomy
Topics: Image Processing Techniques and Applications · Cell Image Analysis Techniques · Advanced Vision and Imaging
Methods: Diffusion
