# Learning Common Representation from RGB and Depth Images

**Authors:** Giorgio Giannone, Boris Chidlovskii

arXiv: 1812.06873 · 2018-12-18

## TL;DR

This paper introduces a deep learning architecture that learns a common representation from RGB and depth images and uses it for semantic segmentation and depth prediction, including cross-modality settings where only one modality is available at test time.

## Contribution

The paper replaces RGB and depth feature fusion with a common deep representation, which allows cross-modality inference and joint learning of semantic segmentation and depth estimation.

## Key findings

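- Jointly learns semantic segmentation and depth prediction from a common representation
- Enables cross-modality inference from a single modality at test time: depth data for semantic segmentation, RGB images for depth estimation
- Works well on both tasks on two publicly available RGB-D datasets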
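
The cross-modality finding above can be illustrated with a toy sketch. It is not the authors' network: the class `CommonRepresentationToy`, the sizes `LATENT_DIM` and `NUM_CLASSES`, the per-pixel linear maps, the untrained random weights, and the choice to average the modality codes are all assumptions made for illustration. The sketch only shows the general idea that two modality-specific encoders can write into one shared code, so both decoders still work when a modality is missing.

```python
import random

LATENT_DIM = 4   # size of the shared code (hypothetical choice)
NUM_CLASSES = 3  # number of segmentation labels (hypothetical choice)


def make_linear(n_in, n_out, rng):
    """Return a random n_out x n_in weight matrix as nested lists."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def apply_linear(weights, vec):
    """Multiply a weight matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


class CommonRepresentationToy:
    """Two modality encoders feed one shared code; two decoders read it."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.enc_rgb = make_linear(3, LATENT_DIM, rng)            # RGB pixel -> code
        self.enc_depth = make_linear(1, LATENT_DIM, rng)          # depth value -> code
        self.dec_seg = make_linear(LATENT_DIM, NUM_CLASSES, rng)  # code -> class scores
        self.dec_depth = make_linear(LATENT_DIM, 1, rng)          # code -> depth

    def encode(self, rgb=None, depth=None):
        """Average the codes of whichever modalities are present (an assumption)."""
        codes = []
        if rgb is not None:
            codes.append(apply_linear(self.enc_rgb, rgb))
        if depth is not None:
            codes.append(apply_linear(self.enc_depth, [depth]))
        if not codes:
            raise ValueError("at least one modality is required")
        return [sum(vals) / len(codes) for vals in zip(*codes)]

    def predict(self, rgb=None, depth=None):
        """Return (class index, depth estimate) for one pixel."""
        code = self.encode(rgb=rgb, depth=depth)
        scores = apply_linear(self.dec_seg, code)
        label = max(range(NUM_CLASSES), key=lambda i: scores[i])
        depth_hat = apply_linear(self.dec_depth, code)[0]
        return label, depth_hat


model = CommonRepresentationToy(seed=0)
pixel_rgb, pixel_depth = [0.2, 0.5, 0.1], 1.7

both = model.predict(rgb=pixel_rgb, depth=pixel_depth)  # both modalities available
rgb_only = model.predict(rgb=pixel_rgb)                 # depth estimated from RGB alone
depth_only = model.predict(depth=pixel_depth)           # segmentation from depth alone
```

Because the weights here are random, the predictions carry no meaning; the point is that `predict` has the same interface and output types whichever modality is supplied. In a trained model, the encoders would have to be learned so that the RGB code and the depth code of the same pixel agree.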

## Abstract

We propose a new deep learning architecture for the tasks of semantic segmentation and depth prediction from RGB-D images. We revise the state of the art based on RGB and depth feature fusion, where both modalities are assumed to be available at train and test time. We propose a new architecture where the feature fusion is replaced with a common deep representation. Combined with an encoder-decoder type of network, the architecture can jointly learn models for semantic segmentation and depth estimation based on their common representation. This representation, inspired by multi-view learning, offers several important advantages, such as using one modality available at test time to reconstruct the missing modality. In the RGB-D case, this enables cross-modality scenarios, such as using depth data for semantic segmentation and RGB images for depth estimation. We demonstrate the effectiveness of the proposed network on two publicly available RGB-D datasets. The experimental results show that the proposed method works well in both semantic segmentation and depth estimation tasks.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1812.06873/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/1812.06873/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/1812.06873/full.md

---
Source: https://tomesphere.com/paper/1812.06873