TOAN: Target-Oriented Alignment Network for Fine-Grained Image Categorization with Few Labeled Samples

Huaxi Huang; Junjie Zhang; Jian Zhang; Qiang Wu; Chang Xu

arXiv:2005.13820·cs.CV·April 2, 2021

TL;DR

TOAN introduces a target-oriented alignment approach that enhances fine-grained image categorization with few labeled samples by explicitly reducing intra-class variance through feature matching and discriminative part integration.

Contribution

The paper proposes a novel Target-Oriented Alignment Network (TOAN) that explicitly aligns support and query features and integrates compositional concepts for improved few-shot fine-grained classification.

Findings

01. TOAN outperforms state-of-the-art models on four benchmarks.

02. Explicit feature matching reduces intra-class variance effectively.

03. Discriminative part integration enhances fine-grained feature representation.

Abstract

The challenges of high intra-class variance yet low inter-class fluctuations in fine-grained visual categorization are more severe with few labeled samples, i.e., Fine-Grained categorization problems under the Few-Shot setting (FGFS). High-order features are usually developed to uncover subtle differences between sub-categories in FGFS, but they are less effective in handling the high intra-class variance. In this paper, we propose a Target-Oriented Alignment Network (TOAN) to investigate the fine-grained relation between the target query image and support classes. The feature of each support image is transformed to match that of the query in the embedding feature space, which explicitly reduces the disparity within each category. Moreover, unlike existing FGFS approaches that devise high-order features over the global image with less explicit consideration of…
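The abstract's idea of transforming each support feature to match the query can be illustrated with a small attention-style sketch. This is a minimal illustration under stated assumptions, not the paper's implementation: it assumes features are flattened convolutional maps of shape (channels, positions), uses cosine similarity with a softmax over support positions, and the function name `align_support_to_query` is hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def align_support_to_query(support, query):
    """Re-express a support feature map at the query's spatial positions.

    support: array of shape (c, n_s), c channels over n_s positions.
    query:   array of shape (c, n_q).
    Returns an array of shape (c, n_q). Each column is a convex
    combination of support columns, weighted by similarity to the
    corresponding query position. (Assumed formulation for illustration.)
    """
    # L2-normalize each spatial position so the dot product is cosine similarity.
    s = support / (np.linalg.norm(support, axis=0, keepdims=True) + 1e-8)
    q = query / (np.linalg.norm(query, axis=0, keepdims=True) + 1e-8)
    sim = q.T @ s                    # (n_q, n_s) similarity matrix
    attn = softmax(sim, axis=1)      # each query position attends over support positions
    return support @ attn.T          # (c, n_q) aligned support feature


# Toy example: 64-channel features on a 5x5 grid (25 positions), random values.
rng = np.random.default_rng(0)
support_feat = rng.normal(size=(64, 25))
query_feat = rng.normal(size=(64, 25))
aligned = align_support_to_query(support_feat, query_feat)
print(aligned.shape)  # (64, 25)
```

After alignment, the support feature and the query feature share a spatial layout, so a downstream comparison is less sensitive to pose or position differences between images of the same class.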
