Behavior of linear L2-boosting algorithms in the vanishing learning rate asymptotic
Clément Dombry (UBFC, LMB), Youssef Esstafa (ENSAI)

TL;DR
This paper analyzes the asymptotic behavior of linear L2-boosting algorithms as the learning rate approaches zero, deriving a deterministic limit described by a linear differential equation and examining its implications for training and test errors.
Contribution
It provides a rigorous characterization of the vanishing learning rate limit for linear L2-boosting, including a differential equation description and error analysis.
Findings
Deterministic limit characterized by a linear differential equation.
Training and test errors of the limit are thoroughly analyzed.
Numerical experiments illustrate the theoretical results.
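As a hedged illustration of the first finding, the following worked equations restrict attention to the fitted values on the training sample. This is a finite-dimensional simplification, whereas the paper states its result in an infinite dimensional function space. The symbols are assumptions introduced here, not notation taken from the paper: $S$ is the smoother matrix of the linear base learner, $Y$ the response vector, $\nu$ the learning rate, and $F_m$ the fitted values after $m$ iterations.

```latex
% L2-boosting recursion with a linear base learner (fitted values only)
F_0 = 0, \qquad F_{m+1} = F_m + \nu\, S\,(Y - F_m)
\quad\Longrightarrow\quad
Y - F_m = (I - \nu S)^m\, Y .

% Rescale the iteration count as m = \lfloor t/\nu \rfloor and let \nu \to 0
(I - \nu S)^{\lfloor t/\nu \rfloor} \;\longrightarrow\; e^{-tS},
\qquad
F(t) = \bigl(I - e^{-tS}\bigr)\, Y .

% F is the unique solution of a linear differential equation
F'(t) = S\,\bigl(Y - F(t)\bigr), \qquad F(0) = 0 .
```

In this simplified setting the residual $Y - F(t) = e^{-tS} Y$ is damped along each eigendirection of $S$ at a rate given by the corresponding eigenvalue.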
Abstract
We investigate the asymptotic behaviour of gradient boosting algorithms when the learning rate converges to zero and the number of iterations is rescaled accordingly. We mostly consider L2-boosting for regression with linear base learner as studied in Bühlmann and Yu (2003) and also analyze a stochastic version of the model where subsampling is used at each step (Friedman 2002). We prove a deterministic limit in the vanishing learning rate asymptotic and characterize the limit as the unique solution of a linear differential equation in an infinite dimensional function space. Besides, the training and test error of the limiting procedure are thoroughly analyzed. We finally illustrate and discuss our result on a simple numerical experiment where the linear L2-boosting operator is interpreted as a smoothed projection and time is related to its number of degrees of freedom.
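The rescaling described in the abstract can be checked numerically on a toy problem. The sketch below is not the paper's experiment: it assumes a ridge-regression smoother as the linear base learner, looks only at fitted values on the training sample, and uses the trace of the linear operator as the number of degrees of freedom (the usual convention for linear smoothers). All function names, the synthetic data, and the penalty value are hypothetical choices made for illustration.

```python
import numpy as np

def ridge_smoother(X, lam):
    """Symmetric smoother S = X (X^T X + lam I)^{-1} X^T (assumed base learner)."""
    p = X.shape[1]
    S = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    return (S + S.T) / 2.0  # remove rounding asymmetry before using eigh

def boosting_operator(S, nu, m):
    """Operator B with F_m = B y after m steps at learning rate nu: I - (I - nu S)^m."""
    I = np.eye(S.shape[0])
    return I - np.linalg.matrix_power(I - nu * S, m)

def limit_operator(S, t):
    """Candidate limit I - exp(-t S), via the eigendecomposition of symmetric S."""
    w, V = np.linalg.eigh(S)
    return (V * (1.0 - np.exp(-t * w))) @ V.T

def degrees_of_freedom(S, t):
    """Trace of the limit operator at time t."""
    w = np.linalg.eigvalsh(S)
    return float(np.sum(1.0 - np.exp(-t * w)))

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))      # synthetic design, illustration only
S = ridge_smoother(X, lam=1.0)

t = 2.0
learning_rates = [0.5, 0.1, 0.01, 0.001]
B_limit = limit_operator(S, t)
# Spectral-norm distance between the boosting operator after round(t / nu)
# steps and the candidate limit, for decreasing learning rates.
gaps = [
    np.linalg.norm(boosting_operator(S, nu, int(round(t / nu))) - B_limit, ord=2)
    for nu in learning_rates
]
for nu, gap in zip(learning_rates, gaps):
    print(f"nu={nu:<6} steps={int(round(t / nu)):<5} gap={gap:.2e}")

# Degrees of freedom of the limit as a function of time.
times = [0.0, 0.5, 2.0, 10.0]
dof = [degrees_of_freedom(S, s) for s in times]
for s, d in zip(times, dof):
    print(f"t={s:<5} dof={d:.3f}")
```

Under these assumptions the gap shrinks as the learning rate decreases, and the degrees of freedom grow with time from zero towards the rank of the smoother, which is one way to read the "smoothed projection" interpretation mentioned in the abstract.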
Taxonomy
Topics: Statistical Methods and Inference · Neural Networks and Applications · Stochastic Gradient Optimization Techniques
