Detail Reinforcement Diffusion Model: Augmentation Fine-Grained Visual Categorization in Few-Shot Conditions

Tianxu Wu; Shuo Ye; Shuhuang Chen; Qinmu Peng; Xinge You

arXiv:2309.08097 · cs.CV · May 16, 2024 · 1 citation

TL;DR

This paper introduces the Detail Reinforcement Diffusion Model (DRDM), a novel data augmentation approach leveraging large models to improve fine-grained visual categorization under few-shot conditions by enhancing subtle subclass differences.

Contribution

The paper proposes DRDM with discriminative semantic recombination and spatial knowledge reference modules, effectively utilizing large model knowledge for fine-grained data augmentation in few-shot learning.

Findings

01. DRDM achieves consistent performance improvements in FGVC tasks.
02. The model effectively captures subtle subclass differences.
03. Enhanced decision boundary expansion improves classification accuracy.

Abstract

The challenge in fine-grained visual categorization lies in exploring the subtle differences between subclasses and achieving accurate discrimination. Previous research has relied on large-scale annotated data and pre-trained deep models to achieve this objective. However, when only a limited number of samples is available, such methods may become less effective. Diffusion models have been widely adopted in data augmentation due to their outstanding diversity in data generation. However, the high level of detail required for fine-grained images makes it challenging to employ existing methods directly. To address this issue, we propose a novel approach termed the detail reinforcement diffusion model (DRDM), which leverages the rich knowledge of large models for fine-grained data augmentation and comprises two key components including discriminative semantic…

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Image Processing Techniques and Applications · Advanced Image and Video Retrieval Techniques

Methods: Diffusion