DART: Diversify-Aggregate-Repeat Training Improves Generalization of Neural Networks

Samyak Jain; Sravanti Addepalli; Pawan Sahu; Priyam Dey; R. Venkatesh Babu

arXiv:2302.14685 · cs.LG · June 13, 2023 · 1 citation

Open Access · 1 repository

TL;DR

This paper introduces DART, a training strategy that diversifies models with different augmentations, aggregates their weights during training, and improves neural network generalization and domain robustness.

Contribution

The paper proposes DART, a novel training method combining diverse augmentation-based models and weight aggregation, with theoretical and empirical validation for enhanced generalization.

Findings

1. DART achieves state-of-the-art results on domain generalization benchmarks.

2. Aggregating models during training improves optimization and generalization.

3. Theoretical analysis confirms better generalization bounds with DART.

Abstract

Generalization of neural networks is crucial for deploying them safely in the real world. Common training strategies to improve generalization involve the use of data augmentations, ensembling and model averaging. In this work, we first establish a surprisingly simple but strong benchmark for generalization which utilizes diverse augmentations within a training minibatch, and show that this can learn a more balanced distribution of features. Further, we propose Diversify-Aggregate-Repeat Training (DART) strategy that first trains diverse models using different augmentations (or domains) to explore the loss basin, and further Aggregates their weights to combine their expertise and obtain improved generalization. We find that Repeating the step of Aggregation throughout training improves the overall optimization trajectory and also ensures that the individual models have a sufficiently…
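The Diversify, Aggregate, and Repeat steps named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it replaces the neural network with a two-parameter linear model, assumes uniform weight averaging at a fixed interval, and uses made-up augmentations. Every name here (`sgd_branch`, `average_params`, `dart_toy`, `identity`, `flip`, `rescale`) is hypothetical and chosen only for illustration.

```python
import random


def sgd_branch(params, data, augment, steps, lr, rng):
    """Train one branch: a few SGD steps on y ~ w*x + b under one augmentation."""
    w, b = params
    for _ in range(steps):
        x, y = rng.choice(data)
        x, y = augment(x, y, rng)   # each branch sees only its own augmentation
        err = (w * x + b) - y       # residual of the squared-error loss
        w -= lr * 2.0 * err * x
        b -= lr * 2.0 * err
    return [w, b]


def average_params(param_sets):
    """Aggregate: uniform average of the branch parameters (an assumption)."""
    return [sum(column) / len(column) for column in zip(*param_sets)]


def dart_toy(data, augmentations, rounds=20, steps_per_round=25, lr=0.05, seed=0):
    """Toy Diversify-Aggregate-Repeat loop; the schedule values are arbitrary."""
    rng = random.Random(seed)
    params = [0.0, 0.0]
    for _ in range(rounds):  # Repeat
        # Diversify: every branch starts from the shared weights.
        branches = [
            sgd_branch(params, data, aug, steps_per_round, lr, rng)
            for aug in augmentations
        ]
        # Aggregate: the averaged weights become the next shared starting point.
        params = average_params(branches)
    return params


# Hypothetical label-preserving augmentations for noiseless data on y = 3x.
def identity(x, y, rng):
    return x, y


def flip(x, y, rng):
    return -x, -y


def rescale(x, y, rng):
    s = rng.uniform(0.5, 1.5)
    return s * x, s * y


data = [(k / 10, 3 * k / 10) for k in range(-10, 11) if k != 0]
w, b = dart_toy(data, [identity, flip, rescale])
print(f"w={w:.4f}, b={b:.4f}")
```

Restarting every branch from the averaged weights after each round keeps the branches close to one another in this toy. How often to aggregate and how to weight the branches are left here as free choices.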

Code & Models

Repositories

val-iisc/dart (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications

Methods: Balanced Selection