Mask-CNN: Localizing Parts and Selecting Descriptors for Fine-Grained Image Recognition

Xiu-Shen Wei; Chen-Wei Xie; Jianxin Wu

arXiv:1605.06878·cs.CV·May 24, 2016·103 cites

TL;DR

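This paper introduces Mask-CNN, an end-to-end model for fine-grained image recognition that localizes discriminative parts and uses the resulting object and part masks to select convolutional descriptors. Among the methods compared, it reports the fewest parameters, the lowest feature dimensionality, and the highest recognition accuracy.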

Contribution

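It presents a fully convolutional Mask-CNN model, built without fully connected layers, that localizes discriminative parts (e.g., head and torso) and generates object and part masks for selecting convolutional descriptors. A four-stream network then aggregates the selected object- and part-level descriptors.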

Findings

01 Achieves the highest recognition accuracy among the compared methods.

02 Uses fewer parameters and a lower feature dimensionality than the compared methods.

03 Localizes discriminative parts and selects convolutional descriptors for recognition.

Abstract

Fine-grained image recognition is a challenging computer vision problem, due to the small inter-class variations caused by highly similar subordinate categories, and the large intra-class variations in poses, scales and rotations. In this paper, we propose a novel end-to-end Mask-CNN model without the fully connected layers for fine-grained recognition. Based on the part annotations of fine-grained images, the proposed model consists of a fully convolutional network to both locate the discriminative parts (e.g., head and torso), and more importantly generate object/part masks for selecting useful and meaningful convolutional descriptors. After that, a four-stream Mask-CNN model is built for aggregating the selected object- and part-level descriptors simultaneously. The proposed Mask-CNN model has the smallest number of parameters, lowest feature dimensionality and highest recognition…
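
The descriptor-selection idea in the abstract can be illustrated with a minimal NumPy sketch. It assumes a convolutional feature map of shape (H, W, D) and a binary mask already resized to the same H x W grid; the function names `select_descriptors` and `aggregate` are hypothetical, and the pooling scheme (average plus max pooling, each L2-normalized, then concatenated) is an assumption made for illustration, since the abstract does not say how the selected descriptors are aggregated.

```python
import numpy as np


def select_descriptors(feature_map, mask):
    """Keep only the descriptors whose spatial location lies inside the mask.

    feature_map: (H, W, D) array of convolutional activations.
    mask:        (H, W) binary array on the same spatial grid (assumed).
    Returns an (N, D) array, where N is the number of mask-positive locations.
    """
    if feature_map.shape[:2] != mask.shape:
        raise ValueError("mask must match the feature map's spatial size")
    return feature_map[mask.astype(bool)]


def _l2_normalize(v):
    """Scale v to unit L2 norm; leave an all-zero vector unchanged."""
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v


def aggregate(descriptors):
    """Pool selected descriptors into a single 2*D vector.

    Assumed scheme (not taken from the abstract): average pooling and max
    pooling over the selected locations, each L2-normalized, concatenated.
    """
    if descriptors.shape[0] == 0:
        # Nothing was selected: return a zero vector of the expected size.
        return np.zeros(2 * descriptors.shape[1])
    avg = descriptors.mean(axis=0)
    mx = descriptors.max(axis=0)
    return np.concatenate([_l2_normalize(avg), _l2_normalize(mx)])


# Toy data: a 4x4 feature map with 8 channels and a 2x2 mask region.
rng = np.random.default_rng(0)
fmap = rng.random((4, 4, 8))
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1

selected = select_descriptors(fmap, mask)  # descriptors inside the mask
vec = aggregate(selected)                  # one pooled vector for this stream
```

In a multi-stream setting such as the one the abstract describes, each stream would presumably produce one such vector from its own mask, and the vectors would be combined before classification; how the paper combines them is not stated in the text above.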

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization