Multi-Modal RGB-D Scene Recognition Across Domains

Andrea Ferreri; Silvia Bucci; Tatiana Tommasi

arXiv:2103.14672·cs.CV·September 8, 2021

TL;DR

This paper investigates the domain shift problem in multi-modal RGB-D scene recognition across different cameras, and proposes an adaptive self-supervised translation method to improve cross-domain generalization.

Contribution

The paper identifies a domain shift issue in multi-modal scene datasets and introduces a novel self-supervised translation approach to improve model robustness across camera types.

Findings

01. The domain shift significantly reduces recognition accuracy across different cameras.

02. The proposed self-supervised translation improves cross-camera scene recognition performance.

03. Experimental results validate the effectiveness of the adaptive approach.
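
The accuracy drop described in the first finding can be illustrated with a toy experiment. The sketch below is not the paper's method or data: it assumes synthetic 2-D features, a hypothetical "camera A" and "camera B" that differ only by a constant feature offset, and a simple nearest-centroid classifier. All names and numbers are illustrative assumptions.

```python
import random

def make_samples(n, centers, offset, rng):
    """Draw n 2-D points per class around each center, shifted by a camera-specific offset."""
    data = []
    for label, (cx, cy) in enumerate(centers):
        for _ in range(n):
            x = cx + offset[0] + rng.gauss(0.0, 0.3)
            y = cy + offset[1] + rng.gauss(0.0, 0.3)
            data.append(((x, y), label))
    return data

def fit_centroids(data):
    """Compute the mean feature vector of each class."""
    sums = {}
    for (x, y), label in data:
        sx, sy, count = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, count + 1)
    return {label: (sx / c, sy / c) for label, (sx, sy, c) in sums.items()}

def predict(centroids, point):
    """Return the label of the closest class centroid (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda l: (point[0] - centroids[l][0]) ** 2 + (point[1] - centroids[l][1]) ** 2,
    )

def accuracy(centroids, data):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(predict(centroids, p) == label for p, label in data) / len(data)

rng = random.Random(0)
centers = [(0.0, 0.0), (2.0, 0.0)]  # two hypothetical scene classes

# "Camera A" (source): no offset. "Camera B" (target): assumed constant feature shift.
source_train = make_samples(200, centers, (0.0, 0.0), rng)
source_test = make_samples(200, centers, (0.0, 0.0), rng)
target_test = make_samples(200, centers, (2.0, 0.0), rng)

centroids = fit_centroids(source_train)
in_domain_acc = accuracy(centroids, source_test)
cross_domain_acc = accuracy(centroids, target_test)

print(f"camera A -> camera A accuracy: {in_domain_acc:.2f}")
print(f"camera A -> camera B accuracy: {cross_domain_acc:.2f}")
```

In this toy setup the classifier trained on camera A keeps its decision boundary fixed, so the shifted camera B samples of the first class fall on the wrong side of it. Real RGB-D domain shift is more complex than a constant offset; the sketch only shows why evaluating on the training camera alone can hide the problem.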

Abstract

Scene recognition is one of the basic problems in computer vision research, with extensive applications in robotics. When available, depth images provide helpful geometric cues that complement the RGB texture information and help to identify discriminative scene image features. Depth sensing technology has developed rapidly in recent years, and a great variety of 3D cameras have been introduced, each with different acquisition properties. However, those properties are often neglected when targeting big data collections, so multi-modal images are gathered disregarding their original nature. In this work, we put under the spotlight the existence of a possibly severe domain shift issue within multi-modality scene recognition datasets. As a consequence, a scene classification model trained on one camera may not generalize to data from a different camera, only providing a low recognition…

Code & Models

Repositories

silvia1993/multi-modal_rgb-d_scene_recognition_across_domains
PyTorch · Official
