3D Gated Recurrent Fusion for Semantic Scene Completion

Yu Liu; Jie Li; Qingsen Yan; Xia Yuan; Chunxia Zhao; Ian Reid; Cesar Cadena

arXiv:2002.07269 · cs.CV · February 19, 2020 · 22 cites

TL;DR

This paper introduces GRFNet, a novel 3D gated recurrent fusion network that adaptively combines RGB and depth data for improved semantic scene completion, demonstrating superior results on benchmark datasets.

Contribution

The paper presents a new multi-stage fusion strategy and a 3D gated recurrent network for effective multi-modal data fusion in SSC tasks.

Findings

01. GRFNet outperforms existing methods on benchmark datasets.

02. Multi-stage fusion models correlations among different stages.

03. Adaptive fusion improves semantic scene understanding.

Abstract

This paper tackles the problem of data fusion in the semantic scene completion (SSC) task, which can simultaneously deal with semantic labeling and scene completion. RGB images contain texture details of the object(s) which are vital for semantic scene understanding. Meanwhile, depth images capture geometric clues of high relevance for shape completion. Using both RGB and depth images can further boost the accuracy of SSC over employing one modality in isolation. We propose a 3D gated recurrent fusion network (GRFNet), which learns to adaptively select and fuse the relevant information from depth and RGB by making use of the gate and memory modules. Based on the single-stage fusion, we further propose a multi-stage fusion strategy, which could model the correlations among different stages within the network. Extensive experiments on two benchmark datasets demonstrate the superior…
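The gate-and-memory idea described in the abstract can be illustrated with a small sketch. This is not the authors' architecture: it assumes a standard GRU-style update, replaces 3D convolutions with a per-voxel channel mixing (equivalent to a 1x1x1 convolution), uses random untrained weights, and feeds depth before RGB by arbitrary choice. The names `GatedFusionCell`, `fuse_single_stage`, and `fuse_multi_stage` are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class GatedFusionCell:
    """GRU-style cell applied independently at every voxel (assumed form)."""

    def __init__(self, channels, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(2 * channels)
        shape = (2 * channels, channels)  # maps concat [input, memory] to C channels
        self.w_update = rng.normal(0.0, scale, shape)
        self.w_reset = rng.normal(0.0, scale, shape)
        self.w_cand = rng.normal(0.0, scale, shape)

    def step(self, x, h):
        """x, h: feature volumes of shape (D, H, W, C); returns the new memory."""
        xh = np.concatenate([x, h], axis=-1)
        z = sigmoid(xh @ self.w_update)  # update gate: how much memory to overwrite
        r = sigmoid(xh @ self.w_reset)   # reset gate: how much memory feeds the candidate
        cand = np.tanh(np.concatenate([x, r * h], axis=-1) @ self.w_cand)
        return (1.0 - z) * h + z * cand


def fuse_single_stage(cell, depth_feat, rgb_feat):
    """Treat the two modalities as a length-2 sequence fed to one shared cell."""
    h = np.zeros_like(depth_feat)
    for x in (depth_feat, rgb_feat):
        h = cell.step(x, h)
    return h


def fuse_multi_stage(cell, depth_stages, rgb_stages):
    """Carry the memory across stages so later fusions can see earlier ones."""
    h = np.zeros_like(depth_stages[0])
    for depth_feat, rgb_feat in zip(depth_stages, rgb_stages):
        h = cell.step(depth_feat, h)
        h = cell.step(rgb_feat, h)
    return h


# Toy volumes: 4x4x4 voxels with 8 feature channels per modality.
rng = np.random.default_rng(1)
depth = rng.normal(size=(4, 4, 4, 8))
rgb = rng.normal(size=(4, 4, 4, 8))
cell = GatedFusionCell(channels=8)
fused = fuse_single_stage(cell, depth, rgb)
fused_multi = fuse_multi_stage(cell, [depth, depth], [rgb, rgb])
```

In this sketch the update gate decides, per voxel and per channel, how much of the fused memory is overwritten by the incoming modality. The multi-stage variant assumes all stages share one feature shape, which a real network with changing resolutions would not guarantee.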

Taxonomy

Topics: Advanced Vision and Imaging · Advanced Neural Network Applications · Human Pose and Action Recognition