Semantic-Guided Generative Image Augmentation Method with Diffusion Models for Image Classification

Bohan Li; Xiao Xu; Xinghao Wang; Yutai Hou; Yunlong Feng; Feng Wang; Xuanliang Zhang; Qingfu Zhu; Wanxiang Che

arXiv:2302.02070·cs.CV·January 19, 2024

Open Access

TL;DR

SGID is a novel image augmentation method using diffusion models guided by labels and captions to enhance image diversity while preserving semantic integrity, improving classification performance.

Contribution

This paper introduces SGID, a diffusion-based augmentation technique that balances image diversity and semantic consistency using label and caption guidance.

Findings

01 SGID outperforms baseline augmentation methods on multiple models and datasets.

02 SGID can be combined with other augmentations for further performance gains.

03 Quantitative and qualitative evaluations confirm semantic preservation and diversity.

Abstract

Existing image augmentation methods fall into two categories: perturbation-based methods and generative methods. Perturbation-based methods apply pre-defined perturbations to an original image, but they vary the image only locally and therefore lack image diversity. In contrast, generative methods bring more diversity to the augmented images but may not preserve semantic consistency, and can thus incorrectly change the essential semantics of the original image. To balance image diversity and semantic consistency in augmented images, we propose SGID, a Semantic-guided Generative Image augmentation method with Diffusion models for image classification. Specifically, SGID employs diffusion models to generate augmented images with good image diversity. More importantly, SGID takes image labels and captions as guidance to maintain semantic consistency between the augmented and original…
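
The idea of using labels and captions as guidance can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the two signals are combined, so the prompt format in `build_prompt`, the `Sample` container, and the `generate` callable (a stand-in for an image-to-image diffusion model) are all hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Sample:
    image: Any      # placeholder for image data (e.g. a PIL image or tensor)
    label: str      # class label of the original image
    caption: str    # text description of the original image


def build_prompt(label: str, caption: str) -> str:
    """Join label and caption into one text prompt.

    Assumption: a simple "label, caption" concatenation; the source does not
    specify the actual format.
    """
    label = label.strip()
    caption = caption.strip()
    return f"{label}, {caption}" if caption else label


def augment(sample: Sample,
            generate: Callable[[Any, str], Any],
            n: int = 2) -> List[Sample]:
    """Produce n augmented samples that keep the original label.

    `generate(image, prompt)` is a hypothetical hook for a diffusion model
    conditioned on both the original image and the text prompt.
    """
    prompt = build_prompt(sample.label, sample.caption)
    return [
        Sample(generate(sample.image, prompt), sample.label, sample.caption)
        for _ in range(n)
    ]
```

In this sketch, the label is carried over unchanged to every augmented sample, so the classifier's training target stays fixed while the generator is free to vary appearance. Passing `generate` in as a callable lets any image-to-image generator be swapped in without changing the augmentation loop.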

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Image Retrieval and Classification Techniques · Face recognition and analysis

Methods: Diffusion · RandAugment · Mixup