DART: Diversify-Aggregate-Repeat Training Improves Generalization of Neural Networks
Samyak Jain, Sravanti Addepalli, Pawan Sahu, Priyam Dey, R. Venkatesh Babu

TL;DR
This paper introduces DART, a training strategy that diversifies models with different augmentations, aggregates their weights during training, and improves neural network generalization and domain robustness.
Contribution
The paper proposes DART, a training method that trains several models with different augmentations and repeatedly aggregates their weights during training, and supports it with theoretical analysis and empirical results on generalization.
Findings
DART achieves state-of-the-art results on domain generalization benchmarks.
Aggregating models during training improves optimization and generalization.
The theoretical analysis supports improved generalization with DART.
Abstract
Generalization of neural networks is crucial for deploying them safely in the real world. Common training strategies to improve generalization involve the use of data augmentations, ensembling and model averaging. In this work, we first establish a surprisingly simple but strong benchmark for generalization which utilizes diverse augmentations within a training minibatch, and show that this can learn a more balanced distribution of features. Further, we propose Diversify-Aggregate-Repeat Training (DART) strategy that first trains diverse models using different augmentations (or domains) to explore the loss basin, and further Aggregates their weights to combine their expertise and obtain improved generalization. We find that Repeating the step of Aggregation throughout training improves the overall optimization trajectory and also ensures that the individual models have a sufficiently…
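The Diversify, Aggregate, and Repeat steps described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the linear model, the jitter "augmentations", the equal-weight average, and all names and hyperparameters (`dart_sketch`, `make_toy_data`, `rounds`, `steps_per_round`) are assumptions chosen only to keep the example small and dependency-free.

```python
import random


def average_weights(models):
    """Element-wise mean of a list of equally sized weight vectors."""
    n = len(models)
    return [sum(ws) / n for ws in zip(*models)]


def sgd_step(w, x, y, lr):
    """One SGD step on squared error for a linear model y ~ w . x."""
    err = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [wi - lr * err * xi for wi, xi in zip(w, x)]


def make_toy_data(n=200, seed=1):
    """Hypothetical toy data: y = 2*x0 - 1*x1 with inputs in [-1, 1]."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        data.append((x, 2.0 * x[0] - 1.0 * x[1]))
    return data


def dart_sketch(data, augmentations, rounds=20, steps_per_round=50,
                lr=0.05, seed=0):
    """Toy Diversify-Aggregate-Repeat loop (assumed structure, not the paper's code)."""
    rng = random.Random(seed)
    shared = [0.0] * len(data[0][0])
    for _ in range(rounds):                  # Repeat
        branches = []
        for augment in augmentations:        # Diversify: one branch per augmentation
            w = list(shared)                 # every branch starts from the shared weights
            for _ in range(steps_per_round):
                x, y = rng.choice(data)
                w = sgd_step(w, augment(x, rng), y, lr)
            branches.append(w)
        shared = average_weights(branches)   # Aggregate: equal-weight average (assumption)
    return shared


# Stand-in "augmentations": identity plus two strengths of Gaussian input jitter.
AUGMENTATIONS = [
    lambda x, rng: x,
    lambda x, rng: [xi + rng.gauss(0, 0.05) for xi in x],
    lambda x, rng: [xi + rng.gauss(0, 0.10) for xi in x],
]

weights = dart_sketch(make_toy_data(), AUGMENTATIONS)
```

In this sketch every branch restarts from the averaged weights after each round, which is one simple way to keep the branches close enough for averaging to be meaningful; how often to aggregate and how to weight the branches are design choices the abstract does not specify here.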
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications
Methods: Balanced Selection
