Cost-Sensitive Approach to Batch Size Adaptation for Gradient Descent

Matteo Pirotta; Marcello Restelli

arXiv:1712.03428 · cs.LG · December 12, 2017

TL;DR

This paper introduces a cost-sensitive method for automatically adapting batch size in stochastic gradient descent by optimizing the trade-off between gradient estimate accuracy and computational cost, demonstrated on classification tasks.

Contribution

It presents an automated batch size adaptation technique that chooses the batch size maximizing the ratio between a lower bound on the expected improvement (from a linear or quadratic Taylor approximation) and the number of samples used to estimate the gradient.

Findings

1. Compares the proposed method empirically with related batch size methods on popular classification tasks.

2. Demonstrates automatic batch size tuning in stochastic gradient descent.

3. Balances the accuracy of the gradient estimate against the sample cost of each update.

Abstract

In this paper, we propose a novel approach to automatically determine the batch size in stochastic gradient descent methods. The choice of the batch size induces a trade-off between the accuracy of the gradient estimate and the cost, in terms of samples, of each update. We propose to determine the batch size by optimizing the ratio between a lower bound on a linear or quadratic Taylor approximation of the expected improvement and the number of samples used to estimate the gradient. The performance of the proposed approach is empirically compared with related methods on popular classification tasks. This work was presented at the NIPS Workshop on Optimizing the Optimizers, Barcelona, Spain, 2016.
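
The ratio described in the abstract can be illustrated with a minimal sketch. The abstract does not state the actual bound, so the form used here is an assumption made purely for illustration: the lower bound on the linear-approximation improvement is taken to be step_size * (grad_sq_norm - noise / sqrt(n)), where n is the batch size. The function names and the parameters grad_sq_norm and noise are hypothetical and are not taken from the paper.

```python
import math


def improvement_lower_bound(n, grad_sq_norm, noise, step_size=1.0):
    """Assumed (illustrative) lower bound on the linearized improvement.

    The estimation error is assumed to shrink as 1/sqrt(n); this is not
    the bound derived in the paper.
    """
    return step_size * (grad_sq_norm - noise / math.sqrt(n))


def cost_sensitive_objective(n, grad_sq_norm, noise, step_size=1.0):
    """Improvement lower bound per sample used to estimate the gradient."""
    return improvement_lower_bound(n, grad_sq_norm, noise, step_size) / n


def best_batch_size(grad_sq_norm, noise, n_min=1, n_max=100_000):
    """Batch size maximizing the assumed ratio, clamped to [n_min, n_max].

    For (a - b / sqrt(n)) / n with a, b > 0, setting the derivative to
    zero gives the stationary point n = (3b / (2a))^2.
    """
    n_star = (1.5 * noise / grad_sq_norm) ** 2
    # Check the two integers around the continuous optimum.
    candidates = {math.floor(n_star), math.ceil(n_star)}
    clamped = [min(max(n, n_min), n_max) for n in candidates]
    return max(
        clamped,
        key=lambda n: cost_sensitive_objective(n, grad_sq_norm, noise),
    )


print(best_batch_size(grad_sq_norm=4.0, noise=40.0))  # 225
```

Under this assumed bound, a noisier gradient estimate (larger noise relative to grad_sq_norm) pushes the selected batch size up, while a larger gradient norm pushes it down, which is the accuracy-versus-cost trade-off the abstract describes. The step size is positive and therefore does not affect which batch size maximizes the ratio.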

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms · Machine Learning and ELM