Stochastic Particle Gradient Descent for Infinite Ensembles

Atsushi Nitanda; Taiji Suzuki

arXiv:1712.05438 · stat.ML · December 18, 2017 · 28 cites

Open Access

TL;DR

This paper introduces a stochastic particle gradient descent method for infinite ensembles. By optimizing over a space of probability measures, it handles the $L^1$-constraint and the non-negative constraint rigorously, and it yields a residual-type network with convergence guarantees.

Contribution

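It proposes a general-purpose stochastic optimization method for learning probability measures in ensemble models, parameterized by transport maps on base models. This avoids the early stopping or rough approximation that $L^1$-regularized formulations otherwise require, and it comes with convergence guarantees.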

Findings

01 The method can handle $L^1$-constraints rigorously.

02 It achieves convergence rates comparable to finite-dimensional stochastic optimization.

03 The resulting ensemble forms a residual-type network with an interior optimality property.

Abstract

The superior performance of ensemble methods with infinite models is well known. Most of these methods are based on optimization problems in infinite-dimensional spaces with some regularization; for instance, boosting methods and convex neural networks use $L^{1}$-regularization with the non-negative constraint. However, because $L^{1}$-regularization is difficult to handle, these problems require early stopping or a rough approximation and are solved only inexactly. In this paper, we propose a new ensemble learning method that operates in a space of probability measures; that is, our method can handle the $L^{1}$-constraint and the non-negative constraint in a rigorous way. Such an optimization is realized by proposing a general-purpose stochastic optimization method for learning probability measures via parameterization using transport maps on base models. As a result of running the method, a…
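To make the idea of learning a probability measure over base models concrete, here is a minimal finite-particle sketch. It is not the authors' algorithm: the paper works with transport maps in function space and obtains a residual-type network, whereas this example simply represents the measure by a fixed number of particles and moves each one along a stochastic gradient. The tanh base model, the squared loss, and the names `init_particles`, `base_model`, `ensemble_predict`, `mse`, and `particle_sgd` are all assumptions made for illustration. Because the ensemble is a uniform average over particles, its weights are non-negative and sum to one by construction, which is how a measure-based formulation sidesteps explicit $L^{1}$-regularization.

```python
import numpy as np


def init_particles(n_particles, dim, seed=0):
    # Draw particles (base-model parameters) from a standard normal prior.
    rng = np.random.default_rng(seed)
    return rng.normal(size=(n_particles, dim))


def base_model(theta, X):
    # Hypothetical base model h(theta, x) = tanh(x . theta).
    # theta: (M, d), X: (n, d) -> outputs of shape (n, M).
    return np.tanh(X @ theta.T)


def ensemble_predict(theta, X):
    # Average over the empirical measure: f(x) = (1/M) * sum_i h(theta_i, x).
    return base_model(theta, X).mean(axis=1)


def mse(theta, X, y):
    return float(np.mean((ensemble_predict(theta, X) - y) ** 2))


def particle_sgd(theta, X, y, lr=0.2, n_steps=300, batch=32, seed=0):
    # Stochastic gradient steps on every particle for the loss 0.5 * (f(x) - y)^2.
    rng = np.random.default_rng(seed)
    theta = theta.copy()  # leave the caller's particles untouched
    for _ in range(n_steps):
        idx = rng.choice(len(X), size=batch, replace=False)
        Xb, yb = X[idx], y[idx]
        H = base_model(theta, Xb)        # (batch, M)
        resid = H.mean(axis=1) - yb      # d loss / d f at each sample
        # Mean-field scaling: each particle follows
        # mean_b[ resid_b * d/dtheta tanh(x_b . theta_i) ], without the 1/M factor.
        grad = ((resid[:, None] * (1.0 - H ** 2)).T @ Xb) / batch  # (M, d)
        theta -= lr * grad
    return theta
```

A typical use would be to draw particles with `init_particles`, pass them with training data to `particle_sgd`, and compare `mse` before and after. The step size, batch size, and number of steps here are arbitrary defaults, not values taken from the paper.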

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Random Matrices and Applications · Markov Chains and Monte Carlo Methods

Methods: Early Stopping