Semantic RGB-D Image Synthesis

Shijie Li; Rong Li; Juergen Gall

arXiv:2308.11356 · cs.CV · September 20, 2023

Open Access

TL;DR

This paper introduces a multi-modal semantic RGB-D image synthesis method that generates realistic RGB-D images from semantic labels, improving data diversity and segmentation accuracy in privacy-sensitive applications.

Contribution

The paper proposes a novel generator and discriminator architecture for multi-modal data that improves the realism and semantic consistency of synthesized images.

Findings

01. Outperforms previous uni-modal methods significantly

02. Mixing real and generated images improves segmentation accuracy

03. Generates realistic RGB-D images from semantic label maps

Abstract

Collecting diverse sets of training images for RGB-D semantic image segmentation is not always possible. In particular, when robots need to operate in privacy-sensitive areas like homes, the collection is often limited to a small set of locations. As a consequence, the annotated images lack diversity in appearance and approaches for RGB-D semantic image segmentation tend to overfit the training data. In this paper, we thus introduce semantic RGB-D image synthesis to address this problem. It requires synthesising a realistic-looking RGB-D image for a given semantic label map. Current approaches, however, are uni-modal and cannot cope with multi-modal data. Indeed, we show that extending uni-modal approaches to multi-modal data does not perform well. In this paper, we therefore propose a generator for multi-modal data that separates modal-independent information of the semantic layout…

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization