CNN: Single-label to Multi-label

Yunchao Wei; Wei Xia; Junshi Huang; Bingbing Ni; Jian Dong; Yao Zhao; Shuicheng Yan

arXiv:1406.5726·cs.CV·December 6, 2016

TL;DR

This paper introduces Hypotheses-CNN-Pooling (HCP), a flexible deep learning framework that effectively handles multi-label image classification without requiring bounding box annotations, outperforming existing methods on Pascal VOC datasets.

Contribution

The paper proposes a novel CNN-based infrastructure that aggregates object hypotheses for multi-label classification without needing explicit bounding box labels or hypothesis annotations.

Findings

1. Achieves 84.2% mAP on VOC2012 with HCP alone.

2. Outperforms state-of-the-art methods by over 7% mAP.

3. Demonstrates robustness to noisy hypotheses and no need for explicit hypothesis labels.

Abstract

Convolutional Neural Network (CNN) has demonstrated promising performance in single-label image classification tasks. However, how CNN best copes with multi-label images still remains an open problem, mainly due to the complex underlying object layouts and insufficient multi-label training images. In this work, we propose a flexible deep CNN infrastructure, called Hypotheses-CNN-Pooling (HCP), where an arbitrary number of object segment hypotheses are taken as the inputs, then a shared CNN is connected with each hypothesis, and finally the CNN output results from different hypotheses are aggregated with max pooling to produce the ultimate multi-label predictions. Some unique characteristics of this flexible deep CNN infrastructure include: 1) no ground truth bounding box information is required for training; 2) the whole HCP infrastructure is robust to possibly noisy and/or redundant…
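The aggregation step described in the abstract (a shared CNN scores each hypothesis, then the scores are combined by max pooling) can be illustrated with a minimal sketch. This is not the authors' implementation: `hcp_predict` and `toy_shared_cnn` are hypothetical names, the "CNN" is a lookup table standing in for a real network, and hypothesis extraction is assumed to have happened already.

```python
from typing import Callable, List, Sequence


def hcp_predict(
    hypotheses: Sequence[str],
    shared_cnn: Callable[[str], List[float]],
) -> List[float]:
    """Score every hypothesis with the same model, then take the
    per-class maximum across hypotheses (cross-hypothesis max pooling)."""
    if not hypotheses:
        raise ValueError("at least one hypothesis is required")
    # One score vector per hypothesis, all produced by the shared model.
    per_hypothesis = [shared_cnn(h) for h in hypotheses]
    num_classes = len(per_hypothesis[0])
    # For each class, keep the highest score seen on any hypothesis.
    return [
        max(scores[c] for scores in per_hypothesis)
        for c in range(num_classes)
    ]


def toy_shared_cnn(hypothesis: str) -> List[float]:
    """Hypothetical stand-in for the shared CNN: fixed 3-class scores."""
    table = {
        "crop_a": [0.9, 0.1, 0.0],  # strongly suggests class 0
        "crop_b": [0.2, 0.7, 0.1],  # strongly suggests class 1
        "crop_c": [0.1, 0.1, 0.1],  # noisy or background hypothesis
    }
    return table[hypothesis]


image_scores = hcp_predict(["crop_a", "crop_b", "crop_c"], toy_shared_cnn)
print(image_scores)  # [0.9, 0.7, 0.1]
```

In this toy example the low-scoring `crop_c` does not change the image-level prediction, which is the intuition behind the claimed robustness to noisy or redundant hypotheses: a hypothesis only affects a class score when it is the strongest evidence for that class.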

Taxonomy

Methods: Max Pooling