MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features

Liang-Chieh Chen; Alexander Hermans; George Papandreou; Florian Schroff; Peng Wang; Hartwig Adam

arXiv:1712.04837 · cs.CV · December 14, 2017 · 35 cites

Open Access

TL;DR

MaskLab is an instance segmentation model that refines object detection by integrating semantic and direction features, enabling accurate separation of object instances within images.

Contribution

The paper introduces MaskLab, a novel framework combining detection, semantic segmentation, and direction prediction for improved instance segmentation.

Findings

01 Achieves performance comparable to state-of-the-art models on the COCO benchmark.

02 Separates instances of the same semantic class using direction prediction.

03 Refines object detection with semantic segmentation cues and recent segmentation techniques.

Abstract

In this work, we tackle the problem of instance segmentation, the task of simultaneously solving object detection and semantic segmentation. Towards this goal, we present a model, called MaskLab, which produces three outputs: box detection, semantic segmentation, and direction prediction. Building on top of the Faster-RCNN object detector, the predicted boxes provide accurate localization of object instances. Within each region of interest, MaskLab performs foreground/background segmentation by combining semantic and direction prediction. Semantic segmentation assists the model in distinguishing between objects of different semantic classes including background, while the direction prediction, estimating each pixel's direction towards its corresponding center, allows separating instances of the same semantic class. Moreover, we explore the effect of incorporating recent successful…
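
To make the direction-prediction idea concrete, the following is a minimal sketch of how per-pixel direction labels could be derived from a single instance mask. It is an illustration under stated assumptions, not the paper's implementation: the function name `direction_labels`, the use of the mask centroid as the instance center, the default of 8 direction bins, and the `-1` background label are all hypothetical choices made for this example.

```python
import numpy as np


def direction_labels(mask, num_bins=8):
    """Label each foreground pixel with the quantized direction toward the mask centroid.

    Assumptions (not from the paper): the instance center is the centroid of
    the mask, angles are measured in array coordinates (y grows downward),
    and background pixels receive the label -1.
    """
    mask = np.asarray(mask, dtype=bool)
    labels = np.full(mask.shape, -1, dtype=np.int64)

    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return labels  # empty mask: nothing to label

    # Centroid of the instance, used here as its "center".
    cy, cx = ys.mean(), xs.mean()

    # Angle of the vector pointing from each pixel to the center.
    angles = np.arctan2(cy - ys, cx - xs)

    # Round to the nearest of num_bins evenly spaced directions.
    bin_width = 2.0 * np.pi / num_bins
    bins = np.rint(angles / bin_width).astype(np.int64) % num_bins

    labels[ys, xs] = bins
    return labels


if __name__ == "__main__":
    square = np.ones((3, 3), dtype=bool)
    print(direction_labels(square))
```

Two pixels of the same semantic class on opposite sides of a center receive different direction labels, which is the kind of signal that can help tell apart touching instances of one class.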

Taxonomy

Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques

Methods: Average Pooling · Residual Connection · 1x1 Convolution · Batch Normalization · Bottleneck Residual Block · Global Average Pooling · Residual Block · Kaiming Initialization · Max Pooling