Fine-Grained Few Shot Learning with Foreground Object Transformation
Chaofei Wang, Shiji Song, Qisen Yang, Xiang Li, Gao Huang

TL;DR
This paper introduces foreground object transformation (FOT), a data augmentation technique for fine-grained few-shot learning that improves classification performance by generating additional samples through posture transformation of foreground objects.
Contribution
The paper proposes a novel foreground object transformation method that enhances existing few-shot learning algorithms for fine-grained classification by generating more training samples.
Findings
FOT significantly improves the performance of existing few-shot learning baselines.
FOT achieves state-of-the-art results on fine-grained few-shot learning (FG-FSL) tasks.
FOT is also effective on general few-shot learning (FSL) tasks.
Abstract
Traditional fine-grained image classification generally requires abundant labeled samples to deal with the low inter-class variance but high intra-class variance problem. However, in many scenarios we may have limited samples for some novel sub-categories, leading to the fine-grained few shot learning (FG-FSL) setting. To address this challenging task, we propose a novel method named foreground object transformation (FOT), which is composed of a foreground object extractor and a posture transformation generator. The former aims to remove image background, which tends to increase the difficulty of fine-grained image classification as it amplifies the intra-class variance while reducing inter-class variance. The latter transforms the posture of the foreground object to generate additional samples for the novel sub-category. As a data augmentation method, FOT can be conveniently applied to…
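The two-stage idea in the abstract (remove the background, then transform the foreground object to create extra samples) can be illustrated with a minimal sketch. This is not the authors' implementation: the foreground mask is assumed to be given rather than produced by a learned extractor, and a horizontal flip is a toy stand-in for the learned posture transformation generator. The names `extract_foreground`, `flip_horizontal`, and `augment` are hypothetical.

```python
from typing import Callable, List

Image = List[List[int]]  # grayscale image as rows of pixel intensities
Mask = List[List[int]]   # 1 marks a foreground pixel, 0 marks background


def extract_foreground(image: Image, mask: Mask, fill: int = 0) -> Image:
    """Keep foreground pixels and replace background pixels with `fill`.

    Assumption: the mask is supplied by the caller; the paper's extractor
    would have to produce it.
    """
    return [
        [pixel if keep else fill for pixel, keep in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


def flip_horizontal(image: Image) -> Image:
    """Toy stand-in for a posture transformation: mirror the object."""
    return [list(reversed(row)) for row in image]


def augment(
    image: Image, mask: Mask, transforms: List[Callable[[Image], Image]]
) -> List[Image]:
    """Return the background-free image plus one extra sample per transform."""
    foreground = extract_foreground(image, mask)
    return [foreground] + [transform(foreground) for transform in transforms]


# A 3x3 image whose background intensity is 5 and whose object is 9, 8, 7.
image = [[5, 9, 5], [5, 8, 7], [5, 5, 5]]
mask = [[0, 1, 0], [0, 1, 1], [0, 0, 0]]
samples = augment(image, mask, [flip_horizontal])
```

Here `samples` holds the background-free image followed by its mirrored copy, so one labeled example of a novel sub-category yields two training samples. A learned generator would take the place of `flip_horizontal` in the `transforms` list.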
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications
