Enhancing Fine-grained Image Classification through Attentive Batch Training

Duy M. Le; Bao Q. Bui; Anh Tran; Cong Tran; Cuong Pham

arXiv:2412.19606·cs.CV·December 30, 2024

TL;DR

This paper introduces a novel batch training framework with attention mechanisms that significantly improves fine-grained image classification accuracy by leveraging relationships between images within each batch.

Contribution

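The paper proposes three components: Residual Relationship Attention (RRA), a module that integrates the feature vectors of images in a batch; Relationship Position Encoding (RPE), a technique that encodes the positions of relationships between images in a batch; and Relationship Batch Integration (RBI), a framework that combines RRA and RPE to enhance feature extraction in batch training for fine-grained classification.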

Findings

01. Improved classification accuracy by 2.78% on the CUB200-2011 dataset.

02. Improved classification accuracy by 3.83% on the Stanford Dog dataset.

03. Set a new state-of-the-art accuracy of 95.79% on the Stanford Dog dataset.

Abstract

Fine-grained image classification, which is a challenging task in computer vision, requires precise differentiation among visually similar object categories. In this paper, we propose 1) a novel module called Residual Relationship Attention (RRA) that leverages the relationships between images within each training batch to effectively integrate visual feature vectors of batch images and 2) a novel technique called Relationship Position Encoding (RPE), which encodes the positions of relationships between original images in a batch and effectively preserves the relationship information between images within the batch. Additionally, we design a novel framework, namely Relationship Batch Integration (RBI), which utilizes RRA in conjunction with RPE, allowing the discernment of vital visual features that may remain elusive when examining a singular image representative of a particular class.…
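To make the idea of relating images within a batch more concrete, the following is a minimal, dependency-free sketch of attention computed across the feature vectors of one batch, followed by a residual connection. It is an illustration only, not the authors' RRA module: the function names `softmax` and `batch_attention_with_residual` are hypothetical, the learned query/key/value projections that an attention layer would normally have are omitted, and Relationship Position Encoding is not modelled because the abstract does not specify its form.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def batch_attention_with_residual(features):
    """Mix each image's feature vector with the others in the same batch.

    features: list of B feature vectors, each a list of D floats.
    Returns a list with the same shape. This is an assumed, simplified
    stand-in (no learned projections, no position encoding), not the
    paper's implementation.
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    mixed = []
    for query in features:
        # Scaled dot-product similarity between this image and every
        # image in the batch (including itself).
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale
            for key in features
        ]
        weights = softmax(scores)
        # Attention output: weighted sum of all batch feature vectors.
        attended = [
            sum(w * feat[d] for w, feat in zip(weights, features))
            for d in range(dim)
        ]
        # Residual connection: keep the original feature and add the
        # batch-level context on top of it.
        mixed.append([x + a for x, a in zip(query, attended)])
    return mixed
```

The output keeps the input shape (B vectors of D values), so in a setup like this it could be passed to an ordinary classification head. With a batch of one image the attention weight is 1, so the result is simply twice the input vector, which makes the residual term easy to check by hand.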

Code & Models

Models

duyminhle/rra (model on Hugging Face)

Taxonomy

Topics: COVID-19 diagnosis using AI

Methods: Softmax · Attention Is All You Need