Submodular Batch Selection for Training Deep Neural Networks

K J Joseph, Vamshi Teja R, Krishnakant Singh, Vineeth N Balasubramanian

arXiv:1906.08771 · cs.LG · June 21, 2019 · 1 citation

TL;DR

This paper proposes a submodular function-based batch selection method for training deep neural networks, improving generalization by selecting more informative and diverse samples during training.

Contribution

It introduces a novel submodular formulation for batch selection that captures the informativeness of each sample and the diversity of the selected subset, together with an efficient greedy algorithm for the resulting NP-hard maximization problem.
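
For readers unfamiliar with the term, the standard definition of submodularity (the diminishing-returns property) and the cardinality-constrained selection problem can be written as below. The symbols $V$ (the pool of candidate samples), $f$ (the set scoring function), and $k$ (the batch size) are notation assumed here for illustration; they are not taken from the paper.

```latex
% Diminishing returns: adding x to a smaller set helps at least as much
% as adding it to a larger superset.
f : 2^{V} \to \mathbb{R} \text{ is submodular if, for all } A \subseteq B \subseteq V \text{ and } x \in V \setminus B,
\qquad f(A \cup \{x\}) - f(A) \;\ge\; f(B \cup \{x\}) - f(B).

% Batch selection as constrained maximization.
S^{*} \in \arg\max_{S \subseteq V,\; |S| \le k} f(S)
```

As a general fact about this problem class (Nemhauser, Wolsey, and Fisher, 1978), when $f$ is additionally monotone and non-negative, the greedy algorithm returns a set whose value is at least $(1 - 1/e)$ times the optimum. Whether the paper's specific objective meets those conditions is not stated on this page.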

Findings

01. Improved model generalization across datasets and hyperparameters

02. Better generalization than Stochastic Gradient Descent and a popular baseline sampling strategy

03. Gains hold across different learning rates, batch sizes, and distance metrics

Abstract

Mini-batch gradient descent based methods are the de facto algorithms for training neural network architectures today. We introduce a mini-batch selection strategy based on submodular function maximization. Our novel submodular formulation captures the informativeness of each sample and diversity of the whole subset. We design an efficient, greedy algorithm which can give high-quality solutions to this NP-hard combinatorial optimization problem. Our extensive experiments on standard datasets show that the deep models trained using the proposed batch selection strategy provide better generalization than Stochastic Gradient Descent as well as a popular baseline sampling strategy across different learning rates, batch sizes, and distance metrics.
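
The greedy idea described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' formulation: the objective here is a hypothetical stand-in (a sum of per-sample informativeness scores plus a facility-location coverage term), and the names `rbf_similarity`, `objective`, `greedy_select`, `lam`, and `gamma` are all invented for this example.

```python
import math


def rbf_similarity(x, y, gamma=1.0):
    """Similarity in (0, 1]; equals 1 when x == y (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def objective(selected, features, informativeness, lam=1.0):
    """Hypothetical score, NOT the paper's objective.

    Informativeness: sum of non-negative per-sample scores (modular term).
    Diversity: facility-location coverage, i.e. how well every pool
    element is represented by its most similar selected element.
    """
    if not selected:
        return 0.0
    info = sum(informativeness[i] for i in selected)
    coverage = sum(
        max(rbf_similarity(features[j], features[i]) for i in selected)
        for j in range(len(features))
    )
    return info + lam * coverage


def greedy_select(features, informativeness, batch_size, lam=1.0):
    """Repeatedly add the sample with the largest marginal gain."""
    selected = []
    remaining = set(range(len(features)))
    current = 0.0
    while remaining and len(selected) < batch_size:
        best_idx, best_gain = None, -math.inf
        for i in sorted(remaining):  # sorted for deterministic tie-breaking
            gain = objective(selected + [i], features, informativeness, lam) - current
            if gain > best_gain:
                best_idx, best_gain = i, gain
        selected.append(best_idx)
        remaining.remove(best_idx)
        current += best_gain
    return selected


# Toy pool: two tight clusters plus one outlier, with made-up scores.
features = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
informativeness = [0.9, 0.8, 0.5, 0.4, 0.3]
batch = greedy_select(features, informativeness, batch_size=3)
print(batch)
```

On this toy pool the greedy pass picks one sample from each region of the feature space rather than the three highest-scoring samples, two of which are near-duplicates. The naive loop re-evaluates the objective for every candidate at every step; a practical implementation would cache marginal gains or use lazy evaluation.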

Code & Models

Repositories

VamshiTeja/SMDL (PyTorch, official)

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications · Machine Learning and Algorithms