Enhancing Image Classification in Small and Unbalanced Datasets through Synthetic Data Augmentation
Neil De La Fuente, Mireia Majó, Irina Luzko, Henry Córdova, Gloria Fernández-Esparrach, and Jorge Bernal

TL;DR
This paper introduces a synthetic data augmentation method using class-specific VAEs and latent space interpolation to improve image classification accuracy in small, imbalanced datasets, especially in medical imaging.
Contribution
The paper presents a novel augmentation strategy that generates realistic synthetic data via VAEs and latent space interpolation, enhancing model performance on small, imbalanced datasets.
Findings
Over 18% accuracy increase in the underrepresented class
6% improvement in global accuracy and precision
Effective in small, imbalanced medical image datasets
Abstract
Accurate and robust medical image classification is a challenging task, especially in application domains where available annotated datasets are small and present high imbalance between target classes. Considering that data acquisition is not always feasible, especially for underrepresented classes, our approach introduces a novel synthetic augmentation strategy using class-specific Variational Autoencoders (VAEs) and latent space interpolation to improve discrimination capabilities. By generating realistic, varied synthetic data that fills feature space gaps, we address issues of data scarcity and class imbalance. The method presented in this paper relies on the interpolation of latent representations within each class, thus enriching the training set and improving the model's generalizability and diagnostic accuracy. The proposed strategy was tested in a small dataset of 321 images…
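The within-class latent interpolation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `interpolate_latents`, the linear mixing rule, the uniform sampling of the mixing weight, and the random stand-in latent vectors are all assumptions made for illustration. In a real pipeline the latents would come from the encoder of a VAE trained on a single class, and the interpolated vectors would be passed through that VAE's decoder to obtain synthetic images.

```python
import numpy as np


def interpolate_latents(latents, n_new, rng=None):
    """Create n_new latent vectors by linearly interpolating between
    random pairs of latent vectors that belong to the same class.

    latents: array of shape (n_samples, latent_dim) for ONE class.
    rng: seed or numpy Generator (hypothetical convenience parameter).
    """
    rng = np.random.default_rng(rng)
    latents = np.asarray(latents, dtype=float)
    n = latents.shape[0]
    if n < 2:
        raise ValueError("need at least two latent vectors to interpolate")

    # Pick a first index, then a second one guaranteed to differ from it.
    i = rng.integers(0, n, size=n_new)
    j = (i + rng.integers(1, n, size=n_new)) % n

    # Mixing weight in [0, 1): 0 reproduces latents[i], values near 1
    # approach latents[j] (uniform sampling is an assumption).
    alpha = rng.uniform(0.0, 1.0, size=(n_new, 1))
    return (1.0 - alpha) * latents[i] + alpha * latents[j]


# Stand-in for encoder outputs of one minority class
# (hypothetical sizes: 5 images, 8-dimensional latent space).
minority_latents = np.random.default_rng(0).normal(size=(5, 8))
synthetic = interpolate_latents(minority_latents, n_new=20, rng=1)
print(synthetic.shape)  # (20, 8)
```

Because each synthetic vector is a convex combination of two latents from the same class, it stays inside the region already spanned by that class, which is one plausible reading of "filling feature space gaps" without crossing into other classes.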
Taxonomy
Topics: Artificial Intelligence in Healthcare · Machine Learning and Data Classification · AI in cancer detection
Methods: Sparse Evolutionary Training
