Advancing Fine-Grained Classification by Structure and Subject Preserving Augmentation

Eyal Michaeli; Ohad Fried

arXiv:2406.14551 · cs.CV · February 11, 2025 · 1 cites

TL;DR

SaSPA is a data augmentation method for fine-grained visual classification (FGVC) that preserves structure and subject details. It increases dataset diversity without using real images as guidance, and it outperforms both traditional and recent generative augmentation techniques.

Contribution

The paper proposes SaSPA, a flexible augmentation approach that conditions on image edges and subjects, significantly improving FGVC performance over traditional and recent generative methods.

Findings

01 SaSPA outperforms baseline augmentation methods across multiple FGVC benchmarks.

02 Synthetic data can effectively complement real data, with an optimal ratio depending on dataset size.

03 Conditioning on image structure enhances the quality and diversity of generated images.
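
The second finding, that synthetic data complements real data at some ratio, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a per-sample scheme in which each real training item is swapped for one of its synthetic variants with probability `ratio`. The function name `mix_real_and_synthetic`, the file names, and the ratio value are all hypothetical.

```python
import random


def mix_real_and_synthetic(real_items, synthetic_by_index, ratio, rng):
    """Return one epoch's sample list.

    Each real item is replaced by a randomly chosen synthetic variant
    with probability `ratio` (an assumed mixing scheme, for illustration).
    Items without any synthetic variant are always kept as-is.
    """
    mixed = []
    for i, item in enumerate(real_items):
        variants = synthetic_by_index.get(i, [])
        if variants and rng.random() < ratio:
            mixed.append(rng.choice(variants))
        else:
            mixed.append(item)
    return mixed


# Hypothetical file names: 8 real images, 2 synthetic variants each.
real = [f"real_{i}.jpg" for i in range(8)]
synthetic = {i: [f"syn_{i}_{k}.jpg" for k in range(2)] for i in range(8)}

epoch = mix_real_and_synthetic(real, synthetic, ratio=0.4, rng=random.Random(0))
print(epoch)
```

Under this scheme the epoch length stays equal to the real dataset size, so `ratio` only controls how much of each epoch is synthetic. In practice it would be tuned per dataset, in line with the finding that the best ratio depends on dataset size.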

Abstract

Fine-grained visual classification (FGVC) involves classifying closely related sub-classes. This task is difficult due to the subtle differences between classes and the high intra-class variance. Moreover, FGVC datasets are typically small and challenging to gather, thus highlighting a significant need for effective data augmentation. Recent advancements in text-to-image diffusion models offer new possibilities for augmenting classification datasets. While these models have been used to generate training data for classification tasks, their effectiveness in full-dataset training of FGVC models remains under-explored. Recent techniques that rely on Text2Image generation or Img2Img methods often struggle to generate images that accurately represent the class while modifying them to a degree that significantly increases the dataset's diversity. To address these challenges, we present…

Code & Models

Repositories

eyalmichaeli/saspa-aug
PyTorch · Official

Videos

Advancing Fine-Grained Classification by Structure and Subject Preserving Augmentation · SlidesLive

Taxonomy

Topics: Handwritten Text Recognition Techniques · Image Processing and 3D Reconstruction

Methods: Diffusion