Simultaneous Detection and Segmentation

Bharath Hariharan; Pablo Arbeláez; Ross Girshick; Jitendra Malik

arXiv:1407.1808·cs.CV·July 8, 2014·200 cites

Open Access

TL;DR

This paper defines Simultaneous Detection and Segmentation (SDS), the task of detecting every instance of a category and marking the pixels that belong to it, and introduces a CNN architecture built on R-CNN that improves on the authors' baselines by 7 points (16% relative).

Contribution

The paper presents a novel CNN architecture tailored for SDS that classifies category-independent, bottom-up region proposals and then refines them with category-specific, top-down figure-ground predictions.

Findings

01. 7 point boost (16% relative) over the baselines on SDS

02. 5 point boost (10% relative) over the state of the art on semantic segmentation

03. State-of-the-art performance in object detection

Abstract

We aim to detect all instances of a category in an image and, for each instance, mark the pixels that belong to it. We call this task Simultaneous Detection and Segmentation (SDS). Unlike classical bounding box detection, SDS requires a segmentation and not just a box. Unlike classical semantic segmentation, we require individual object instances. We build on recent work that uses convolutional neural networks to classify category-independent region proposals (R-CNN [16]), introducing a novel architecture tailored for SDS. We then use category-specific, top-down figure-ground predictions to refine our bottom-up proposals. We show a 7 point boost (16% relative) over our baselines on SDS, a 5 point boost (10% relative) over state-of-the-art on semantic segmentation, and state-of-the-art performance in object detection. Finally, we provide diagnostic tools that unpack performance and…
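The difference between the three output types contrasted in the abstract can be made concrete with a toy example. The sketch below is only an illustration, not the authors' code: the dictionary layout, the helper names `mask_to_box` and `to_semantic_map`, and the 3x6 image are all assumptions. It shows that an SDS-style output (one scored mask per instance) can be reduced to a bounding box or to a semantic map, while neither reduction can be inverted to recover the instance masks.

```python
from typing import Dict, List, Tuple

Mask = List[List[int]]  # binary mask, 1 = pixel belongs to the instance


def mask_to_box(mask: Mask) -> Tuple[int, int, int, int]:
    """Tightest box (x_min, y_min, x_max, y_max) around the mask's pixels."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, v in enumerate(row) if v]
    return min(xs), min(ys), max(xs), max(ys)


def to_semantic_map(instances: List[Dict], height: int, width: int) -> List[List[str]]:
    """Collapse instances into a per-pixel category map (instance identity is lost)."""
    semantic = [["background"] * width for _ in range(height)]
    for inst in instances:
        for y in range(height):
            for x in range(width):
                if inst["mask"][y][x]:
                    semantic[y][x] = inst["category"]
    return semantic


# Hypothetical SDS-style output: two "cat" instances in a 3x6 toy image.
instances = [
    {"category": "cat", "score": 0.9, "mask": [[1, 1, 0, 0, 0, 0],
                                                [1, 1, 0, 0, 0, 0],
                                                [0, 0, 0, 0, 0, 0]]},
    {"category": "cat", "score": 0.8, "mask": [[0, 0, 0, 0, 0, 0],
                                                [0, 0, 0, 1, 1, 1],
                                                [0, 0, 0, 1, 1, 1]]},
]

# Detection-style view: one box per instance, no pixel-level shape.
boxes = [mask_to_box(inst["mask"]) for inst in instances]

# Semantic-segmentation-style view: every pixel labelled, instances merged.
semantic = to_semantic_map(instances, height=3, width=6)
```

In the semantic map both animals carry the same label "cat", so the number of instances is no longer recoverable; in the box view the count survives but the shape does not. An SDS output keeps both.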

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications