STD2P: RGBD Semantic Segmentation Using Spatio-Temporal Data-Driven Pooling

Yang He; Wei-Chen Chiu; Margret Keuper; Mario Fritz

arXiv:1604.02388 · cs.CV · April 27, 2017 · 2 cites

TL;DR

This paper introduces a superpixel-based multi-view CNN that leverages spatio-temporal information from multiple RGBD views to improve semantic segmentation accuracy, especially in indoor video scenarios.

Contribution

It presents a novel spatio-temporal pooling layer and a multi-view approach that enhances segmentation by utilizing additional viewpoints and unlabeled frames.

Findings

01 Improves segmentation accuracy over the state-of-the-art single-view and multi-view methods it is compared against.

02 Uses unlabeled frames as additional information during training.

03 Evaluated on the NYU-Depth-V2 and SUN3D datasets.

Abstract

We propose a novel superpixel-based multi-view convolutional neural network for semantic image segmentation. The proposed network produces a high-quality segmentation of a single image by leveraging information from additional views of the same scene. Particularly in indoor videos, such as those captured by robotic platforms or by handheld and body-worn RGBD cameras, nearby video frames provide diverse viewpoints and additional context of objects and scenes. To leverage such information, we first compute region correspondences using optical flow and image boundary-based superpixels. Given these region correspondences, we propose a novel spatio-temporal pooling layer to aggregate information over space and time. We evaluate our approach on the NYU-Depth-V2 and the SUN3D datasets and compare it to various state-of-the-art single-view and multi-view approaches. Besides a general improvement over the…
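The pooling idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that region correspondences have already been computed and are encoded as integer label maps in which a region keeps the same id in every frame where it appears, and it assumes plain average pooling in both the spatial and the temporal step. The function names `spatial_pool` and `spatio_temporal_pool` and the `-1` "unmatched" label are hypothetical choices made for this example.

```python
import numpy as np


def spatial_pool(features, labels, num_regions):
    """Average per-pixel features inside each region of a single frame.

    features: (H, W, C) float array of per-pixel features.
    labels:   (H, W) int array of region ids; -1 marks unmatched pixels
              (an assumption of this sketch).
    Returns (means, present): (num_regions, C) region averages and a
    (num_regions,) boolean mask of regions that occur in this frame.
    """
    channels = features.shape[-1]
    flat_feat = features.reshape(-1, channels)
    flat_lab = labels.reshape(-1)
    valid = flat_lab >= 0

    # Accumulate feature sums and pixel counts per region id.
    sums = np.zeros((num_regions, channels), dtype=np.float64)
    np.add.at(sums, flat_lab[valid], flat_feat[valid])
    counts = np.bincount(flat_lab[valid], minlength=num_regions)

    means = sums / np.maximum(counts, 1)[:, None]
    return means, counts > 0


def spatio_temporal_pool(features, labels, num_regions):
    """Average the per-frame region features over all frames.

    features: (T, H, W, C) float array for T frames.
    labels:   (T, H, W) int array of corresponded region ids.
    A region is averaged only over the frames in which it is present.
    """
    channels = features.shape[-1]
    total = np.zeros((num_regions, channels), dtype=np.float64)
    frames_seen = np.zeros(num_regions, dtype=np.int64)

    for frame_feat, frame_lab in zip(features, labels):
        means, present = spatial_pool(frame_feat, frame_lab, num_regions)
        total += means * present[:, None]
        frames_seen += present

    return total / np.maximum(frames_seen, 1)[:, None]
```

For two frames of shape 1×2 with one feature channel, where region 0 covers both pixels of the first frame and one pixel of the second, calling `spatio_temporal_pool(feats, labs, 2)` first averages within each frame and then across the frames in which each region occurs. Averaging per frame before averaging over time keeps a large region in one view from dominating smaller views of the same region; whether the paper weights views this way is not stated in the abstract.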

Code & Models

Repositories

SSAW14/STD2P · Official

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Vision and Imaging · Video Surveillance and Tracking Methods