LDM3D-VR: Latent Diffusion Model for 3D VR

Gabriela Ben Melech Stan; Diana Wofk; Estelle Aflalo; Shao-Yen Tseng; Zhipeng Cai; Michael Paulitsch; Vasudev Lal

arXiv:2311.03226 · cs.CV · November 7, 2023 · 2 cites

TL;DR

LDM3D-VR introduces diffusion models for generating panoramic RGBD images and upscaling low-resolution inputs, advancing VR content creation with text-guided depth and resolution enhancement.

Contribution

The paper presents novel diffusion models designed specifically for joint RGB and depth map generation and upscaling in virtual reality applications.

Findings

01. Successful generation of panoramic RGBD images from text prompts
02. Effective upscaling of low-resolution RGBD images
03. Models outperform existing related methods

Abstract

Latent diffusion models have proven to be state-of-the-art in the creation and manipulation of visual outputs. However, as far as we know, the generation of depth maps jointly with RGB is still limited. We introduce LDM3D-VR, a suite of diffusion models targeting virtual reality development that includes LDM3D-pano and LDM3D-SR. These models enable the generation of panoramic RGBD based on textual prompts and the upscaling of low-resolution inputs to high-resolution RGBD, respectively. Our models are fine-tuned from existing pretrained models on datasets containing panoramic/high-resolution RGB images, depth maps and captions. Both models are evaluated in comparison to existing related methods.
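To make the data involved concrete, the following is a minimal sketch of an RGBD image (an RGB image paired with an aligned depth map) and of the input/output shape contract of an upscaling step. It is not LDM3D-SR: nearest-neighbor repetition stands in for the diffusion model purely as a naive illustration. The pixel layout `(R, G, B, depth)` and the helper names `pack_rgbd` and `upscale_nearest` are assumptions made for this example and do not come from the paper.

```python
from typing import List, Tuple

# Assumed layout for illustration only: one pixel is (R, G, B, depth).
RGBPixel = Tuple[int, int, int]
RGBDPixel = Tuple[int, int, int, float]
RGBDImage = List[List[RGBDPixel]]


def pack_rgbd(rgb: List[List[RGBPixel]], depth: List[List[float]]) -> RGBDImage:
    """Combine an RGB image and an aligned depth map into one RGBD image."""
    same_height = len(rgb) == len(depth)
    same_width = all(len(r) == len(d) for r, d in zip(rgb, depth))
    if not (same_height and same_width):
        raise ValueError("RGB image and depth map must have the same size")
    return [
        [(*pixel, d) for pixel, d in zip(rgb_row, depth_row)]
        for rgb_row, depth_row in zip(rgb, depth)
    ]


def upscale_nearest(image: RGBDImage, factor: int) -> RGBDImage:
    """Naive baseline: repeat every pixel `factor` times along both axes."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    upscaled: RGBDImage = []
    for row in image:
        # Widen the row, then emit `factor` independent copies of it.
        wide_row = [pixel for pixel in row for _ in range(factor)]
        upscaled.extend(list(wide_row) for _ in range(factor))
    return upscaled


if __name__ == "__main__":
    rgb = [[(255, 0, 0), (0, 255, 0)]]
    depth = [[0.5, 2.0]]
    low_res = pack_rgbd(rgb, depth)
    high_res = upscale_nearest(low_res, factor=2)
    print(len(high_res), len(high_res[0]))
```

A learned super-resolution model would take the place of `upscale_nearest`, consuming a low-resolution RGBD input and returning a higher-resolution RGBD output in which color and depth stay pixel-aligned.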

Code & Models

Datasets

diffusers/community-pipelines-mirror (dataset, 29k downloads)

Taxonomy

Topics: Advanced Vision and Imaging · Video Analysis and Summarization · Advanced Image and Video Retrieval Techniques

Methods: Diffusion