# Object-Part Attention Model for Fine-grained Image Classification

**Authors:** Yuxin Peng, Xiangteng He, and Junjie Zhao

arXiv: 1704.01740 · 2017-11-29

## TL;DR

This paper introduces the object-part attention model (OPAM), a weakly supervised approach to fine-grained image classification. It localizes objects and selects discriminative parts without using object or part annotations, applies object and part spatial constraints to the selected parts, and learns multi-view and multi-scale features from both attention levels.

## Contribution

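OPAM combines two components. The object-part attention model pairs object-level attention, which localizes the object in an image, with part-level attention, which selects discriminative parts of that object. The object-part spatial constraint model keeps the selected parts representative of the object and non-redundant with one another. Neither component uses object or part annotations.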

## Key findings

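- Achieves the best performance in a comparison with more than 10 state-of-the-art methods on 4 widely used datasets.
- Localizes objects and selects discriminative parts without object or part annotations.
- Uses an object spatial constraint to keep selected parts highly representative, and a part spatial constraint to eliminate redundancy and enhance discrimination among them.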

## Abstract

Fine-grained image classification is to recognize hundreds of subcategories belonging to the same basic-level category, such as 200 subcategories belonging to the bird, which is highly challenging due to large variance in the same subcategory and small variance among different subcategories. Existing methods generally first locate the objects or parts and then discriminate which subcategory the image belongs to. However, they mainly have two limitations: (1) Relying on object or part annotations which are heavily labor consuming. (2) Ignoring the spatial relationships between the object and its parts as well as among these parts, both of which are significantly helpful for finding discriminative parts. Therefore, this paper proposes the object-part attention model (OPAM) for weakly supervised fine-grained image classification, and the main novelties are: (1) Object-part attention model integrates two level attentions: object-level attention localizes objects of images, and part-level attention selects discriminative parts of object. Both are jointly employed to learn multi-view and multi-scale features to enhance their mutual promotions. (2) Object-part spatial constraint model combines two spatial constraints: object spatial constraint ensures selected parts highly representative, and part spatial constraint eliminates redundancy and enhances discrimination of selected parts. Both are jointly employed to exploit the subtle and local differences for distinguishing the subcategories. Importantly, neither object nor part annotations are used in our proposed approach, which avoids the heavy labor consumption of labeling. Comparing with more than 10 state-of-the-art methods on 4 widely-used datasets, our OPAM approach achieves the best performance.
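The two spatial constraints named in the abstract can be illustrated with a small sketch. This is not the paper's formulation: it assumes that candidate parts arrive as scored bounding boxes, that the object spatial constraint can be approximated by requiring most of a part's area to fall inside the localized object box, and that the part spatial constraint can be approximated by rejecting parts that overlap an already selected part too much. The function `select_parts`, the thresholds `min_inside` and `max_iou`, and the greedy selection order are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical box format


def area(b: Box) -> float:
    """Area of a box; degenerate or inverted boxes count as zero."""
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def intersection(a: Box, b: Box) -> float:
    """Area of the overlap between two boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    return area((x1, y1, x2, y2))


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    inter = intersection(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def select_parts(
    object_box: Box,
    candidates: List[Box],
    scores: List[float],
    n_parts: int = 2,
    min_inside: float = 0.7,  # assumed threshold, not taken from the paper
    max_iou: float = 0.3,     # assumed threshold, not taken from the paper
) -> List[int]:
    """Greedily pick indices of high-scoring parts under two assumed constraints."""
    order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    selected: List[int] = []
    for i in order:
        box = candidates[i]
        if area(box) == 0:
            continue
        # Object-constraint stand-in: the part must lie mostly inside the object box.
        if intersection(box, object_box) / area(box) < min_inside:
            continue
        # Part-constraint stand-in: the part must not largely duplicate a chosen part.
        if any(iou(box, candidates[j]) > max_iou for j in selected):
            continue
        selected.append(i)
        if len(selected) == n_parts:
            break
    return selected


object_box = (10.0, 10.0, 110.0, 110.0)
candidates = [
    (20.0, 20.0, 60.0, 60.0),      # inside the object
    (25.0, 25.0, 65.0, 65.0),      # near-duplicate of the first box
    (100.0, 100.0, 160.0, 160.0),  # mostly outside the object
    (70.0, 70.0, 105.0, 105.0),    # inside the object, disjoint from the first box
]
scores = [0.9, 0.8, 0.85, 0.5]
chosen = select_parts(object_box, candidates, scores)
print(chosen)
```

In this toy input the third box is dropped because it lies mostly outside the object box, and the second is dropped because it largely overlaps the top-scoring box, so the selection falls through to the lower-scoring but distinct fourth box.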

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1704.01740/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/1704.01740/full.md

## References

63 references — full list in the complete paper: https://tomesphere.com/paper/1704.01740/full.md

---
Source: https://tomesphere.com/paper/1704.01740