Beyond Cropping and Rotation: Automated Evolution of Powerful Task-Specific Augmentations with Generative Models
Judah Goldfeder, Shreyes Kaliyur, Vaibhav Sourirajan, Patrick Minwan Puma, Philippe Martin Wyder, Yuhang Hu, Jiong Lin, Hod Lipson

TL;DR
EvoAug is an automated pipeline that uses generative models and evolutionary algorithms to learn task-specific data augmentations, improving model robustness and performance in fine-grained and few-shot learning tasks.
Contribution
The paper introduces EvoAug, a novel method combining generative models with evolutionary algorithms to automatically learn structured, hierarchical augmentations tailored to specific tasks.
Findings
EvoAug is reported to outperform traditional augmentation methods on fine-grained and few-shot classification tasks.
The learned augmentations often align with domain knowledge.
EvoAug is effective even in low-data scenarios.
Abstract
Data augmentation has long been a cornerstone for reducing overfitting in vision models, with methods like AutoAugment automating the design of task-specific augmentations. Recent advances in generative models, such as conditional diffusion and few-shot NeRFs, offer a new paradigm for data augmentation by synthesizing data with significantly greater diversity and realism. However, unlike traditional augmentations like cropping or rotation, these methods introduce substantial changes that enhance robustness but also risk degrading performance if the augmentations are poorly matched to the task. In this work, we present EvoAug, an automated augmentation learning pipeline, which leverages these generative models alongside an efficient evolutionary algorithm to learn optimal task-specific augmentations. Our pipeline introduces a novel approach to image augmentation that learns stochastic…
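To make the idea of evolving an augmentation policy concrete, here is a minimal sketch of an evolutionary search loop. It is not the authors' algorithm: the operators are toy arithmetic stand-ins for real augmentations (classical or generative), a policy is simplified to a short sequence of operator names rather than a tree, and the `fitness` function is a hypothetical proxy. All names (`OPS`, `random_policy`, `fitness`, `mutate`, `evolve`) and all hyperparameters are assumptions made for illustration.

```python
import random

# Toy operators on a float; hypothetical stand-ins for real augmentations.
OPS = {
    "identity": lambda x: x,
    "shift": lambda x: x + 1.0,
    "scale": lambda x: x * 2.0,
    "negate": lambda x: -x,
}
OP_NAMES = sorted(OPS)


def random_policy(rng, max_len=2):
    """Sample a policy: a sequence of 1..max_len operator names (assumption)."""
    return [rng.choice(OP_NAMES) for _ in range(rng.randint(1, max_len))]


def apply_policy(policy, x):
    """Apply each operator in order to a single sample."""
    for name in policy:
        x = OPS[name](x)
    return x


def fitness(policy, train, val):
    """Hypothetical proxy: closeness of the augmented training mean to the
    validation mean (higher is better). A real pipeline would instead train
    or evaluate a model on the augmented data."""
    augmented = [apply_policy(policy, x) for x in train]
    return -abs(sum(augmented) / len(augmented) - sum(val) / len(val))


def mutate(policy, rng, rate=0.3):
    """Resample each operator independently with probability `rate`."""
    return [rng.choice(OP_NAMES) if rng.random() < rate else name
            for name in policy]


def evolve(train, val, pop_size=8, generations=10, seed=0):
    """Truncation selection with surviving parents and mutated children."""
    rng = random.Random(seed)
    population = [random_policy(rng) for _ in range(pop_size)]
    history = []  # best fitness at the start of each generation
    for _ in range(generations):
        ranked = sorted(population, key=lambda p: fitness(p, train, val),
                        reverse=True)
        history.append(fitness(ranked[0], train, val))
        parents = ranked[: pop_size // 2]
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(pop_size - len(parents))]
        # Parents survive unchanged, so the best policy is never lost.
        population = parents + children
    best = max(population, key=lambda p: fitness(p, train, val))
    return best, history


if __name__ == "__main__":
    best, history = evolve([0.0, 1.0, 2.0], [3.0, 4.0, 5.0])
    print(best, history[-1])
```

Because the parents are carried over unchanged, the best fitness never decreases from one generation to the next. In a realistic setting the expensive part is the fitness evaluation, which is why the choice of fitness function and population size matters far more than it does in this toy.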
Peer Reviews
Decision: Submitted to ICLR 2026
Timely and interesting direction: Combining generative models with augmentation search is appealing, especially for low-data regimes where augmentation matters a lot.
Focus on low-data regime: The fact that they tackle few-shot / fine-grained classification, rather than just standard large-scale regimes, gives practical relevance.
Domain alignment: The finding that discovered augmentations “align with domain knowledge” is interesting; it suggests the search may rediscover meaningful transforma
Limited scalability / generality: The work appears focused on low-data tasks with few-shot classification; it’s unclear how well this would scale to large-scale datasets, high-resolution images, or many-class settings.
Baseline comparisons: It’s unclear if the comparisons include state-of-the-art augmentation methods (e.g., AutoAugment, RandAugment, AugMix) combined with strong generative models, or whether the generative augmentation is compared fairly.
The paper addresses a very important and cutting-edge problem: how to use powerful generative models (Diffusion, NeRFs) as data augmentation operators and automatically integrate them into the learning pipeline. This is a challenging but high-potential direction that goes beyond traditional augmentation methods. The authors employed a commendable and rigorous approach in their experimental setup (Section 4.1) by intentionally selecting the subset with the lowest baseline accuracy from 10 subset
The paper's core premise is that integrating expensive generative operators (Diffusion, NeRF) provides an advantage on fine-grained tasks. However, in the 2-shot and 5-shot experimental results (Tables 1 and 2), the performance of EvoAug (Learned) is not superior to (and often worse than) strong baselines like RandAugment. The authors themselves admit the results are "mixed". Given the significant additional computational overhead EvoAug introduces in both search (EA) and application (generative
1. The combination of controlled diffusion, NeRF-based transformations, and evolutionary search represents a significant methodological innovation in data augmentation.
2. Addressing augmentation learning in few-shot and one-shot settings is both timely and important, as most prior work assumes large datasets.
3. The introduction of clustering-based and loss-based fitness functions for augmentation policy learning without labels is a thoughtful and practical contribution.
4. Algorithms are ex
1. The evolutionary search settings (population size, mutation rate, depth-2 trees) are fixed; no sensitivity analysis is presented.
2. The contribution of each augmentation operator type (diffusion, NeRF, classical) is not isolated; it’s unclear how much the generative components actually contribute over strong classical baselines.
3. While the paper focuses on classification, there is little empirical evidence suggesting EvoAug’s adaptability to other tasks (e.g., detection, segmentation), despi
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
