TL;DR
This paper introduces a mask-guided attention (MGA) method that uses a pre-trained segmentation model to improve fine-grained patchy image classification, especially when training data is limited.
Contribution
The novel MGA approach integrates auxiliary supervision from a segmentation model to enhance discriminative feature learning in data-scarce scenarios.
Findings
MGA outperforms state-of-the-art methods on three patchy image datasets.
Ablation study shows MGA improves accuracy by over 2% on key datasets.
MGA effectively filters insignificant image parts, boosting robustness.
Abstract
In this work, we present a novel mask-guided attention (MGA) method for fine-grained patchy image classification. The key challenge of fine-grained patchy image classification is twofold: ultra-fine-grained inter-category variances among objects, and very little data available for training. This motivates us to employ a more useful supervision signal to train a discriminative model with limited training samples. Specifically, the proposed MGA integrates a pre-trained semantic segmentation model that produces an auxiliary supervision signal, i.e., a patchy attention mask, enabling discriminative representation learning. The patchy attention mask drives the classifier to filter out the insignificant parts of images (e.g., common features between different categories), which enhances the robustness of MGA for fine-grained patchy image classification. We verify the…
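The filtering idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's architecture or training procedure, which the abstract does not specify; it only assumes that a segmentation model has already produced a soft mask with values in [0, 1], and shows one common way such a mask could down-weight insignificant regions of a feature map before pooling. The function name `mask_guided_pool`, the array shapes, and the mask-weighted average are all illustrative assumptions.

```python
import numpy as np


def mask_guided_pool(features, mask, eps=1e-8):
    """Mask-weighted average pooling (illustrative, not the paper's method).

    features: array of shape (C, H, W), e.g. a backbone feature map.
    mask:     array of shape (H, W) with values in [0, 1], assumed to come
              from a pre-trained segmentation model.
    Returns a (C,) descriptor in which low-mask locations contribute little.
    """
    features = np.asarray(features, dtype=np.float64)
    mask = np.asarray(mask, dtype=np.float64)
    if features.ndim != 3 or features.shape[1:] != mask.shape:
        raise ValueError("expected features (C, H, W) and mask (H, W)")
    # Broadcast the mask over channels so each location is scaled once.
    weighted = features * mask[None, :, :]
    # Normalize by the total mask weight; eps guards against an all-zero mask.
    return weighted.sum(axis=(1, 2)) / (mask.sum() + eps)


# Toy example: a mask that keeps only the center location.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3, 3))
mask = np.zeros((3, 3))
mask[1, 1] = 1.0
pooled = mask_guided_pool(features, mask)
```

With this one-hot mask, the pooled descriptor reduces to the feature vector at the single retained location, which is the "filter out everything else" behavior in its most extreme form. A softer mask would blend locations in proportion to their mask values.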
