Training Deep Neural Networks via Branch-and-Bound

Yuanwei Wu; Ziming Zhang; Guanghui Wang

arXiv:2104.01730·cs.CV·October 26, 2021

TL;DR

This paper introduces BPGrad, a branch-and-bound based algorithm for deep neural network training. It adaptively determines the step size for each gradient from the history of previous updates, is proved to reach the optimal solution within finite iterations, and compares favorably to other stochastic optimization methods in experiments.

Contribution

The paper presents BPGrad, a novel approximate algorithm for deep neural network training that adaptively estimates feasible regions and guarantees finite-step optimality.

Findings

01

BPGrad performs well in object recognition, detection, and segmentation tasks.

02

Empirical results show BPGrad compares favorably to other stochastic optimization methods.

03

The method efficiently finds optimal solutions within finite iterations.

Abstract

In this paper, we propose BPGrad, a novel approximate algorithm for deep neural network training, based on adaptive estimates of the feasible region via branch-and-bound. The method assumes that the objective function is Lipschitz continuous, and as a result, it can adaptively determine the step size for the current gradient given the history of previous updates. We prove that, by repeating such a branch-and-pruning procedure, it can achieve the optimal solution within finite iterations. A computationally efficient solver based on BPGrad is proposed to train deep neural networks. Empirical results demonstrate that the BPGrad solver works well in practice and compares favorably to other stochastic optimization methods in the tasks of object recognition, detection, and segmentation. The code is available at https://github.com/RyanCV/BPGrad.
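To make the role of the Lipschitz assumption concrete, here is a minimal Python sketch of how such a bound can yield a step size. It is an illustration under stated assumptions, not the authors' solver: if f is L-Lipschitz, then f(y) >= f(x) - L * ||x - y||, so no point within distance (f(x) - target) / L of x can have a value below target, and that distance is a safe step length along the normalized negative gradient. The function name `lipschitz_descent`, the shrink factor `rho`, the choice of target as `rho` times the best value seen so far, and the toy objective are all hypothetical choices made for this example, which also assumes a nonnegative objective.

```python
import math


def lipschitz_descent(f, grad, x, lipschitz, rho=0.5, iters=50):
    """Toy descent whose step length comes from a Lipschitz bound.

    Assumes f is nonnegative and `lipschitz` upper-bounds its Lipschitz
    constant. `rho` in [0, 1) is a hypothetical shrink factor: the target
    value at each step is rho times the best value seen so far.
    """
    best = f(x)
    for _ in range(iters):
        fx = f(x)
        best = min(best, fx)
        g = grad(x)
        norm = math.sqrt(sum(gi * gi for gi in g))
        if norm == 0.0:
            break  # stationary point, no direction to move in
        # Within this radius no point can have a value below rho * best,
        # so the whole ball can be skipped in a single step.
        radius = (fx - rho * best) / lipschitz
        x = [xi - radius * gi / norm for xi, gi in zip(x, g)]
    return x, min(best, f(x))


def norm_f(x):
    """Toy objective: Euclidean norm, which is 1-Lipschitz and nonnegative."""
    return math.sqrt(sum(xi * xi for xi in x))


def norm_grad(x):
    """Gradient of the Euclidean norm (zero vector at the origin)."""
    n = norm_f(x)
    return [xi / n for xi in x] if n > 0.0 else [0.0 for _ in x]


x_final, f_final = lipschitz_descent(norm_f, norm_grad, [3.0, 4.0], lipschitz=1.0)
print(f_final)
```

On this toy objective each step multiplies the function value by `rho`, so the iterate approaches the minimizer at the origin geometrically. The step length is derived from the current function value and the update history rather than from a hand-tuned learning rate, which is the property the abstract highlights.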

Code & Models

Repositories

RyanCV/BPGrad
Official

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications · Machine Learning and ELM

Methods: RMSProp · Adam