Deep Joint Task Learning for Generic Object Extraction

Xiaolong Wang; Liliang Zhang; Liang Lin; Zhujin Liang; Wangmeng Zuo

arXiv:1502.00743·cs.CV·February 4, 2015·37 cites

Open Access

TL;DR

This paper introduces a joint deep learning framework that simultaneously localizes and segments objects in images, significantly improving accuracy and speed over previous methods by integrating two CNNs with latent variables and an EM optimization process.

Contribution

It presents a novel joint task learning approach combining object localization and segmentation CNNs with latent variables and EM optimization, enhancing performance without hand-crafted features.

Findings

01 Outperforms state-of-the-art methods in accuracy

02 Runs roughly 1000 times faster than competing approaches

03 Integrates the localization and segmentation tasks in a single framework

Abstract

This paper investigates how to extract objects of interest without relying on hand-crafted features or sliding-window approaches, aiming to jointly solve two sub-tasks: (i) rapidly localizing salient objects in images, and (ii) accurately segmenting the objects based on the localizations. We present a general joint task learning framework in which each task (either object localization or object segmentation) is tackled by a multi-layer convolutional neural network, and the two networks work collaboratively to boost performance. In particular, we propose to incorporate latent variables that bridge the two networks in a joint optimization. The first network directly predicts the positions and scales of salient objects from raw images, and the latent variables adjust the object localizations before they are fed to the second network, which produces pixelwise object masks. An EM-type method is…
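The role of the latent variables can be illustrated with a small toy sketch. It is not the authors' implementation: the two networks are replaced by hypothetical stand-ins (a fixed predicted box and a simple threshold "segmenter"), and the names `shift_box`, `segment_in_box`, and `best_adjustment` are invented for illustration. The sketch assumes that the latent variable is a discrete shift of the predicted box, and that it is chosen so that the resulting mask best matches a ground-truth mask, which is one plausible reading of the E-step of an EM-type procedure. The M-step (updating both networks) is omitted.

```python
import itertools
import numpy as np

def shift_box(box, dx, dy):
    """Apply a latent shift (dx, dy) to a box given as (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    return (x0 + dx, y0 + dy, x1 + dx, y1 + dy)

def segment_in_box(image, box, threshold=0.5):
    """Toy stand-in for the segmentation network: threshold pixels inside the box."""
    h, w = image.shape
    x0, y0, x1, y1 = box
    x0, y0 = max(0, x0), max(0, y0)      # clip the box to the image bounds
    x1, y1 = min(w, x1), min(h, y1)
    mask = np.zeros((h, w), dtype=bool)
    if x1 > x0 and y1 > y0:
        mask[y0:y1, x0:x1] = image[y0:y1, x0:x1] > threshold
    return mask

def iou(a, b):
    """Intersection over union of two boolean masks."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 1.0

def best_adjustment(image, gt_mask, box, shifts=(-2, 0, 2)):
    """E-step-like search (assumed form): pick the shift whose mask fits gt_mask best."""
    best = None
    for dx, dy in itertools.product(shifts, shifts):
        candidate = shift_box(box, dx, dy)
        score = iou(segment_in_box(image, candidate), gt_mask)
        if best is None or score > best[0]:
            best = (score, (dx, dy), candidate)
    return best

# Synthetic example: a bright 4x4 object on a dark 12x12 background.
image = np.zeros((12, 12))
image[4:8, 5:9] = 1.0
gt_mask = image > 0.5

# Hypothetical output of the localization network, offset from the object.
predicted_box = (3, 2, 7, 6)

score, shift, adjusted_box = best_adjustment(image, gt_mask, predicted_box)
print(shift, adjusted_box, score)
```

In this toy setting the search recovers the offset between the predicted box and the object, so the adjusted box is what would be passed on to the segmentation stage. A real system would replace the exhaustive search over shifts and the threshold segmenter with learned components.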

Taxonomy

Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques