Semantic-Guided Generative Image Augmentation Method with Diffusion Models for Image Classification
Bohan Li, Xiao Xu, Xinghao Wang, Yutai Hou, Yunlong Feng, Feng Wang, Xuanliang Zhang, Qingfu Zhu, Wanxiang Che

TL;DR
SGID is a novel image augmentation method using diffusion models guided by labels and captions to enhance image diversity while preserving semantic integrity, improving classification performance.
Contribution
This paper introduces SGID, a diffusion-based augmentation technique that balances image diversity and semantic consistency using label and caption guidance.
Findings
SGID outperforms baseline augmentation methods on multiple models and datasets.
SGID can be combined with other augmentations for further performance gains.
Quantitative and qualitative evaluations confirm semantic preservation and diversity.
Abstract
Existing image augmentation methods consist of two categories: perturbation-based methods and generative methods. Perturbation-based methods apply pre-defined perturbations to augment an original image, but only locally vary the image, thus lacking image diversity. In contrast, generative methods bring more image diversity in the augmented images but may not preserve semantic consistency, thus incorrectly changing the essential semantics of the original image. To balance image diversity and semantic consistency in augmented images, we propose SGID, a Semantic-guided Generative Image augmentation method with Diffusion models for image classification. Specifically, SGID employs diffusion models to generate augmented images with good image diversity. More importantly, SGID takes image labels and captions as guidance to maintain semantic consistency between the augmented and original…
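To make the abstract's first category concrete, here is a minimal sketch of a perturbation-based augmentation, written in plain Python. It is not taken from the paper: the function name `perturb`, the `max_shift` parameter, the toy 3x3 grayscale image, and the label "cat" are all assumptions chosen for illustration. It shows why such methods keep the label safe but add little diversity: every output is just the original pixels, mirrored and uniformly brightened or darkened.

```python
import random


def perturb(image, rng, max_shift=20):
    """Apply a random horizontal flip and a global brightness shift.

    `image` is a list of rows of grayscale pixel values in [0, 255].
    The input is not modified; a new image is returned.
    """
    out = [row[:] for row in image]
    # Pre-defined perturbation 1: mirror the image with probability 0.5.
    if rng.random() < 0.5:
        out = [row[::-1] for row in out]
    # Pre-defined perturbation 2: add one brightness offset to every pixel,
    # clamping the result to the valid range.
    shift = rng.randint(-max_shift, max_shift)
    return [[min(255, max(0, p + shift)) for p in row] for row in out]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
original = [[40, 80, 120], [50, 90, 130], [60, 100, 140]]  # toy image
label = "cat"  # hypothetical class label

# Each augmented sample reuses the original label unchanged.
augmented = [(perturb(original, rng), label) for _ in range(4)]
```

A generative method, by contrast, would synthesize new pixel content rather than rearrange existing pixels, which is where the abstract's concern about semantic consistency comes from.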
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Image Retrieval and Classification Techniques · Face recognition and analysis
Methods: Diffusion · RandAugment · Mixup
