Generative Expansion of Small Datasets: An Expansive Graph Approach
Vahid Jebraeeli, Bo Jiang, Hamid Krim, Derya Cansever

TL;DR
This paper presents a graph-based generative model that expands a minimal set of samples into a large, diverse dataset, improving data augmentation for machine learning tasks.
Contribution
It introduces an expansive synthesis approach combining expander graphs, feature interpolation, and Koopman operators to generate high-quality, large-scale datasets from limited data.
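As a rough illustration of the feature-interpolation ingredient only, the sketch below expands a handful of feature vectors by drawing random convex combinations of pairs. This is not the authors' procedure: the summary does not say how pairs are chosen or how the expander graph drives the interpolation, so the uniform random pairing, the function name `expand_by_interpolation`, and the toy `seed_features` array are all assumptions made for the example.

```python
import numpy as np

def expand_by_interpolation(features, n_new, seed=None):
    """Create n_new synthetic vectors as convex combinations of random pairs.

    Assumption: pairs are sampled uniformly at random. The paper uses an
    expander graph mapping, whose details are not given in this summary.
    """
    rng = np.random.default_rng(seed)
    n = features.shape[0]
    i = rng.integers(0, n, size=n_new)               # first endpoint of each pair
    j = rng.integers(0, n, size=n_new)               # second endpoint of each pair
    alpha = rng.uniform(0.0, 1.0, size=(n_new, 1))   # mixing weight per new sample
    return alpha * features[i] + (1.0 - alpha) * features[j]

# Toy stand-in for latent features of a small dataset (hypothetical data).
seed_features = np.random.default_rng(0).normal(size=(5, 3))
expanded = expand_by_interpolation(seed_features, n_new=50, seed=1)
```

Because every new vector is a convex combination of two existing ones, the expanded set stays inside the coordinate-wise range of the seed features. That keeps the samples close to the observed data, but it also means interpolation alone cannot produce points outside that range.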
Findings
Generated datasets enable classifiers to perform comparably to those trained on original data.
The model effectively preserves data distribution and feature relationships.
Results demonstrate potential for addressing data scarcity in machine learning.
Abstract
Limited data availability in machine learning significantly impacts performance and generalization. Traditional augmentation methods enhance moderately sufficient datasets. GANs struggle with convergence when generating diverse samples. Diffusion models, while effective, have high computational costs. We introduce an Expansive Synthesis model generating large-scale, information-rich datasets from minimal samples. It uses expander graph mappings and feature interpolation to preserve data distribution and feature relationships. The model leverages neural networks' non-linear latent space, captured by a Koopman operator, to create a linear feature space for dataset expansion. An autoencoder with self-attention layers and optimal transport refines distributional consistency. We validate by comparing classifiers trained on generated data to those trained on original datasets. Results show…
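To make the idea of a linear feature space concrete, here is a minimal sketch of fitting a single linear operator that maps one set of latent vectors to another by least squares. This is a common way to approximate a Koopman operator from paired data. The abstract does not state how the operator is learned in the paper, so the least-squares fit, the function name `fit_linear_operator`, and the synthetic `z_in`/`z_out` pairs are assumptions chosen for illustration.

```python
import numpy as np

def fit_linear_operator(z_in, z_out):
    """Least-squares estimate of K such that z_in @ K approximates z_out.

    Assumption: rows of z_in and z_out are paired latent vectors
    (for example, encoder outputs before and after a transformation).
    """
    K, *_ = np.linalg.lstsq(z_in, z_out, rcond=None)
    return K

# Hypothetical noise-free data generated by a known linear map.
rng = np.random.default_rng(0)
K_true = rng.normal(size=(4, 4))
z_in = rng.normal(size=(100, 4))
z_out = z_in @ K_true

K_hat = fit_linear_operator(z_in, z_out)
```

In this noise-free toy setting the fit recovers `K_true` up to numerical precision. With real encoder features the relationship would only be approximately linear, and the quality of the fit would depend on the learned latent space.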
Taxonomy
Topics: Neural Networks and Applications · Machine Learning in Materials Science
Methods: Diffusion
