Submodular Batch Selection for Training Deep Neural Networks
K J Joseph, Vamshi Teja R, Krishnakant Singh, Vineeth N Balasubramanian

TL;DR
This paper proposes a submodular function-based batch selection method for training deep neural networks, improving generalization by selecting more informative and diverse samples during training.
Contribution
It introduces a novel submodular formulation for batch selection that captures the informativeness of each sample and the diversity of the selected subset, together with an efficient greedy algorithm for the resulting NP-hard maximization problem.
Findings
Improved model generalization across datasets and hyperparameters
Generalizes better than Stochastic Gradient Descent and a popular baseline sampling strategy
Gains hold across different learning rates, batch sizes, and distance metrics
Abstract
Mini-batch gradient descent based methods are the de facto algorithms for training neural network architectures today. We introduce a mini-batch selection strategy based on submodular function maximization. Our novel submodular formulation captures the informativeness of each sample and diversity of the whole subset. We design an efficient, greedy algorithm which can give high-quality solutions to this NP-hard combinatorial optimization problem. Our extensive experiments on standard datasets show that the deep models trained using the proposed batch selection strategy provide better generalization than Stochastic Gradient Descent as well as a popular baseline sampling strategy across different learning rates, batch sizes, and distance metrics.
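To make the idea of greedy submodular batch selection concrete, here is a minimal sketch in plain Python. It is not the paper's formulation: the objective below is an assumed stand-in that adds a per-sample informativeness score to a facility-location coverage term (a standard monotone submodular function used here as a diversity proxy). The names `greedy_batch`, `informativeness`, and `lam`, and the similarity `1 / (1 + distance)`, are all hypothetical choices made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def greedy_batch(features, informativeness, k, lam=1.0):
    """Greedily pick up to k sample indices under an ASSUMED toy objective.

    F(S) = sum_{i in S} informativeness[i]
           + lam * sum_j max_{i in S} sim(i, j)

    The first term rewards informative samples; the second (facility
    location) rewards subsets that cover the whole candidate pool.
    Scores are assumed to be non-negative.
    """
    n = len(features)
    # Assumed similarity in (0, 1]: identical points have similarity 1.
    sim = [[1.0 / (1.0 + euclidean(features[i], features[j]))
            for j in range(n)] for i in range(n)]

    selected = []
    chosen = set()
    # best_cover[j] = highest similarity between point j and the selected set.
    best_cover = [0.0] * n

    for _ in range(min(k, n)):
        best_gain, best_idx = float("-inf"), None
        for i in range(n):
            if i in chosen:
                continue
            # Marginal gain of adding i: its own score plus any
            # improvement in coverage of every candidate point.
            cover_gain = sum(max(sim[i][j] - best_cover[j], 0.0)
                             for j in range(n))
            gain = informativeness[i] + lam * cover_gain
            if gain > best_gain:
                best_gain, best_idx = gain, i
        selected.append(best_idx)
        chosen.add(best_idx)
        best_cover = [max(best_cover[j], sim[best_idx][j]) for j in range(n)]

    return selected


# Two tight clusters with equal informativeness: the coverage term
# pushes the greedy choice to take one sample from each cluster.
pool = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
batch = greedy_batch(pool, [0.5, 0.5, 0.5, 0.5], k=2)
```

For a monotone submodular objective under a cardinality constraint, this kind of greedy selection is known to reach at least a (1 - 1/e) fraction of the optimal value, which is why greedy methods are a common choice for otherwise NP-hard subset selection. The sketch recomputes every marginal gain on each step; a practical implementation would likely need lazy evaluation or candidate subsampling to stay cheap inside a training loop.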
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications · Machine Learning and Algorithms
