Stochastic Particle Gradient Descent for Infinite Ensembles
Atsushi Nitanda, Taiji Suzuki

TL;DR
This paper introduces a stochastic particle gradient descent method for infinite ensembles that handles $L^1$-constraints rigorously, enabling the construction of residual-type networks with proven convergence properties.
Contribution
It proposes a novel stochastic optimization approach for learning probability measures in ensemble models, overcoming regularization challenges and providing convergence guarantees.
Findings
The method can handle $L^1$-constraints rigorously.
It achieves convergence rates comparable to finite-dimensional stochastic optimization.
The resulting ensemble forms a residual-type network with an interior optimality property.
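To make the idea of updating an ensemble through its particles concrete, here is a minimal NumPy sketch. It is not the authors' algorithm: the base model `a * tanh(w . x)`, the squared loss, the finite particle count, and every function name and hyperparameter are assumptions chosen for illustration. Each step moves every particle along a mini-batch estimate of the functional gradient, so the particles are repeatedly mapped by `theta -> theta - lr * v(theta)`, an identity-plus-update form loosely reminiscent of a residual layer.

```python
import numpy as np

def base_model(theta, X):
    """Hypothetical base model h(theta, x) = a * tanh(w . x), theta = (a, w).
    theta: (M, d+1), X: (B, d) -> outputs of shape (M, B)."""
    a, W = theta[:, 0], theta[:, 1:]
    return a[:, None] * np.tanh(W @ X.T)

def base_model_grad(theta, X):
    """Gradient of h with respect to theta, shape (M, B, d+1)."""
    a, W = theta[:, 0], theta[:, 1:]
    t = np.tanh(W @ X.T)                                   # (M, B)
    grad_a = t[:, :, None]                                 # (M, B, 1)
    grad_W = (a[:, None] * (1 - t**2))[:, :, None] * X[None, :, :]  # (M, B, d)
    return np.concatenate([grad_a, grad_W], axis=2)

def ensemble_predict(theta, X):
    """Ensemble output: average of the base models over all particles."""
    return base_model(theta, X).mean(axis=0)               # (B,)

def particle_sgd(X, y, n_particles=50, steps=300, batch=32, lr=0.1, seed=0):
    """Stochastic particle updates under squared loss (illustrative only)."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=(n_particles, X.shape[1] + 1))
    for _ in range(steps):
        idx = rng.integers(0, len(X), size=batch)
        Xb, yb = X[idx], y[idx]
        residual = ensemble_predict(theta, Xb) - yb        # loss derivative, (B,)
        G = base_model_grad(theta, Xb)                     # (M, B, d+1)
        v = (residual[None, :, None] * G).mean(axis=1)     # per-particle direction
        theta = theta - lr * v                             # theta <- (id - lr * v)(theta)
    return theta

def mse(theta, X, y):
    return float(np.mean((ensemble_predict(theta, X) - y) ** 2))

# Toy regression problem (synthetic data, assumed for the demo).
data_rng = np.random.default_rng(1)
X = data_rng.uniform(-2.0, 2.0, size=(200, 1))
y = np.tanh(1.5 * X[:, 0])

theta_init = particle_sgd(X, y, steps=0)   # steps=0 returns the initial particles
theta_fit = particle_sgd(X, y)             # same seed, so same starting particles
initial_loss = mse(theta_init, X, y)
final_loss = mse(theta_fit, X, y)
```

Averaging over particles keeps the ensemble weights non-negative and summing to one by construction, which is one way to picture optimizing over a probability measure rather than over unconstrained coefficients.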
Abstract
The superior performance of ensemble methods with infinite models is well known. Most of these methods are based on optimization problems in infinite-dimensional spaces with some regularization; for instance, boosting methods and convex neural networks use $L^1$-regularization with the non-negative constraint. However, because $L^1$-regularization is difficult to handle, these problems require early stopping or a rough approximation and are therefore solved only inexactly. In this paper, we propose a new ensemble learning method that operates in a space of probability measures; that is, our method can handle the $L^1$-constraint and the non-negative constraint in a rigorous way. Such an optimization is realized by proposing a general-purpose stochastic optimization method for learning probability measures via parameterization using transport maps on base models. As a result of running the method, a…
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Random Matrices and Applications · Markov Chains and Monte Carlo Methods
Methods: Early Stopping
