Towards Fine-grained Image Classification with Generative Adversarial Networks and Facial Landmark Detection

Mahdi Darvish; Mahsa Pouramini; Hamid Bahador

arXiv:2109.00891 · cs.CV · September 3, 2021

TL;DR

This paper enhances fine-grained image classification by using facial landmark detection and improved GAN-based data augmentation with StyleGAN2-ADA, boosting the performance of Vision Transformer models on the Oxford-IIIT Pets dataset.

Contribution

It introduces a novel approach combining facial landmark detection with advanced GAN augmentation to improve fine-grained classification accuracy.

Findings

01

GAN-based augmentation improves classification accuracy

02

Facial landmark cropping enhances image realism

03

Method outperforms standard augmentation techniques

Abstract

Fine-grained classification remains a challenging task because distinguishing between categories requires learning complex, local differences. Diversity in the pose, scale, and position of objects in an image makes the problem even more difficult. Although recent Vision Transformer models achieve high performance, they need an extensive volume of input data. To counter this problem, we used GAN-based data augmentation to generate extra dataset instances. Oxford-IIIT Pets was our dataset of choice for this experiment. It consists of 37 breeds of cats and dogs with variations in scale, pose, and lighting, which intensifies the difficulty of the classification task. Furthermore, we enhanced the performance of a recent Generative Adversarial Network (GAN), the StyleGAN2-ADA model, to generate more realistic images while preventing overfitting to the training set. We did this by…

Code & Models

Repositories

mahdi-darvish/gans-augmented-pet-classifier (PyTorch, official)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Image and Video Retrieval Techniques · Image Processing Techniques and Applications

Methods: Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Adam · Residual Connection · Layer Normalization · Dropout · Softmax · Depthwise Convolution