Weakly Supervised PatchNets: Describing and Aggregating Local Patches for Scene Recognition

Zhe Wang; Limin Wang; Yali Wang; Bowen Zhang; Yu Qiao

arXiv:1609.00153·cs.CV·April 26, 2017


TL;DR

This paper introduces PatchNet, a weakly supervised patch-level model, and VSAD, a hybrid global image representation, achieving state-of-the-art scene recognition results by combining CNN features with descriptor encoding.

Contribution

The paper presents PatchNet, a weakly supervised patch-level network, and VSAD, a hybrid image representation for scene recognition; together they improve accuracy over existing methods.

Findings

01 Achieved 86.2% accuracy on MIT Indoor67

02 Achieved 73.0% accuracy on SUN397

03 Outperformed previous state-of-the-art methods

Abstract

Traditional feature encoding schemes (e.g., Fisher vector) with local descriptors (e.g., SIFT) and recent convolutional neural networks (CNNs) are two classes of successful methods for image recognition. In this paper, we propose a hybrid representation that leverages the discriminative capacity of CNNs and the simplicity of descriptor encoding schemes for image recognition, with a focus on scene recognition. To this end, we make three main contributions. First, we propose an end-to-end, patch-level architecture, called PatchNet, to model the appearance of local patches. PatchNet is essentially a customized network trained in a weakly supervised manner: it uses image-level supervision to guide patch-level feature extraction. Second, we present a hybrid visual representation, called VSAD, by utilizing the robust feature representations…
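The weak supervision described in the abstract, where image-level labels guide patch-level learning, can be illustrated with a minimal sketch: every patch cropped from an image simply inherits that image's scene label and can then be fed to any patch classifier. This is not the authors' pipeline. The function name `sample_weakly_labeled_patches`, the uniform random cropping, and the toy image are all assumptions made for illustration; the excerpt does not say how patches are actually sampled.

```python
import random


def sample_weakly_labeled_patches(image, image_label, patch_size, num_patches, seed=0):
    """Crop random square patches; each one inherits the image-level label.

    Hypothetical helper: `image` is a 2-D list of pixel values, and the
    uniform random sampling is an assumption, not the paper's strategy.
    """
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    if patch_size > height or patch_size > width:
        raise ValueError("patch_size must fit inside the image")

    samples = []
    for _ in range(num_patches):
        top = rng.randint(0, height - patch_size)
        left = rng.randint(0, width - patch_size)
        # Slice out a patch_size x patch_size window.
        patch = [row[left:left + patch_size] for row in image[top:top + patch_size]]
        # Weak supervision: no patch annotation exists, so reuse the image label.
        samples.append((patch, image_label))
    return samples


# Toy 6x8 "image" whose only annotation is the scene label "kitchen".
toy_image = [[r * 8 + c for c in range(8)] for r in range(6)]
pairs = sample_weakly_labeled_patches(toy_image, "kitchen", patch_size=3, num_patches=4)
```

Each element of `pairs` is a `(patch, label)` tuple that a patch-level network could train on. Some patches will not actually be characteristic of the scene, which is the noise a weakly supervised model has to tolerate.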


Code & Models

Repositories

wangzheallen/vsad (Official)
