Focus Longer to See Better: Recursively Refined Attention for Fine-Grained Image Classification

Prateek Shroff; Tianlong Chen; Yunchao Wei; Zhangyang Wang

arXiv:2005.10979·cs.CV·May 25, 2020

TL;DR

This paper introduces a recursive attention mechanism that iteratively refines focus on discriminative image parts, improving fine-grained classification accuracy without requiring additional annotations.

Contribution

It proposes a simple, interpretable attention model that mimics human visual focus, enhancing feature extraction for fine-grained image classification using only image-level labels.

Findings

01 Boosts classification accuracy by up to 2%
02 Provides interpretability of focus changes from coarse to fine details
03 Operates without bounding box or part annotations

Abstract

Deep neural networks have made great strides in coarse-grained image classification, in part because of their strong ability to extract discriminative feature representations from images. However, the marginal visual differences between classes in fine-grained images make this task harder. In this paper, we focus on these marginal differences to extract more representative features. Similar to human vision, our network repeatedly focuses on parts of images to spot small discriminative parts among the classes. Moreover, we show through interpretability techniques how our network's focus changes from coarse to fine details. Through our experiments, we also show that a simple attention model can aggregate (weighted) these finer details to focus on the most dominant discriminative part of the image. Our network uses only image-level labels and does not…
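The weighted aggregation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (the official PyTorch repository is listed under Code & Models); it only assumes that each refined part yields a feature vector and a scalar relevance score, and that the scores are turned into weights with a softmax. The names `softmax`, `aggregate`, `part_features`, and `scores` are hypothetical and chosen for illustration.

```python
import math


def softmax(scores):
    """Turn raw relevance scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(part_features, scores):
    """Attention-weighted sum of per-part feature vectors.

    part_features: list of equal-length feature vectors, one per part
                   (hypothetical stand-ins for features of refined crops).
    scores:        one raw relevance score per part (assumed to come from
                   some learned scoring function, not shown here).
    """
    weights = softmax(scores)
    dim = len(part_features[0])
    pooled = [
        sum(w * feat[i] for w, feat in zip(weights, part_features))
        for i in range(dim)
    ]
    return pooled, weights


if __name__ == "__main__":
    # Three toy "parts" with 2-D features; the last part scores highest.
    feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    pooled, weights = aggregate(feats, [0.1, 0.2, 2.0])
    print(weights)
    print(pooled)
```

In this sketch the part with the highest score dominates the pooled vector, which is one simple way a model could emphasize the most discriminative region while still using the others.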

Code & Models

Repositories

TAMU-VITA/Focus-Longer-to-See-Better
PyTorch · Official

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications

Methods: Interpretability