Mask-Guided Feature Extraction and Augmentation for Ultra-Fine-Grained Visual Categorization
Zicheng Pan, Xiaohan Yu, Miaohua Zhang, Yongsheng Gao

TL;DR
This paper introduces a mask-guided feature extraction and augmentation technique for ultra-fine-grained visual categorization, addressing challenges of small sample sizes and minimal inter-class variance to improve discriminative feature learning.
Contribution
The proposed method effectively extracts and augments discriminative regions using minimal training data, enhancing ultra-fine-grained classification performance.
Findings
Consistently outperforms ten benchmark methods
Improves feature discriminability visually and quantitatively
Requires only small target region samples for training
Abstract
While fine-grained visual categorization (FGVC) has advanced greatly in recent years, ultra-fine-grained visual categorization (Ultra-FGVC) remains understudied. FGVC aims at classifying objects from the same species (very similar categories), while Ultra-FGVC targets the more challenging problem of classifying images at an ultra-fine granularity where even human experts may fail to identify the visual difference. The challenges for Ultra-FGVC mainly come from two aspects: first, Ultra-FGVC is prone to overfitting due to the lack of training samples; second, the inter-class variance among images is much smaller than in normal FGVC tasks, which makes it difficult to learn discriminative features for each class. To solve these challenges, a mask-guided feature extraction and feature augmentation method is proposed…
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning
