
TL;DR
This paper provides a detailed analysis of $L_2$Boosting for linear regression, revealing its solution path, unique cycling behavior, and limitations in correlated data, along with a proposed augmentation method to improve its performance.
Contribution
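It derives an exact closed-form expression for the number of steps $L_2$Boosting takes along a fixed coordinate direction, uses it to characterize the algorithm's solution path and active set cycling behavior, and introduces a data augmentation technique to address its deficiencies in correlated problems.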
Findings
Exact closed-form expression for the number of steps $L_2$Boosting takes along a fixed coordinate direction
Identification of active set cycling behavior
Data augmentation improves performance in correlated data
Abstract
We consider $L_2$Boosting, a special case of Friedman's generic boosting algorithm applied to linear regression under $L_2$-loss. We study $L_2$Boosting for an arbitrary regularization parameter and derive an exact closed form expression for the number of steps taken along a fixed coordinate direction. This relationship is used to describe $L_2$Boosting's solution path, to describe new tools for studying its path, and to characterize some of the algorithm's unique properties, including active set cycling, a property where the algorithm spends lengthy periods of time cycling between the same coordinates when the regularization parameter is arbitrarily small. Our fixed descent analysis also reveals a repressible condition that limits the effectiveness of $L_2$Boosting in correlated problems by preventing desirable variables from entering the solution path. As a simple remedy, a data…
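To make the abstract's terms concrete, here is a minimal sketch of componentwise $L_2$Boosting for linear regression. It is not taken from the paper. It assumes the common formulation in which the columns of `X` are centered with unit norm, `y` is centered, and each step moves the coefficient of the coordinate most correlated with the current residual by a fraction `nu` (the regularization parameter). The names `l2boost` and `run_lengths` are hypothetical. `run_lengths` only counts consecutive steps along the same coordinate empirically; it does not reproduce the paper's closed-form expression.

```python
import numpy as np
from itertools import groupby


def l2boost(X, y, nu=0.1, n_steps=100):
    """Componentwise L2Boosting sketch.

    Assumes (not enforced) that the columns of X have unit norm, so the
    least-squares coefficient of the residual on column j is X[:, j] @ resid.
    Returns the coefficient vector and the sequence of selected coordinates.
    """
    n, p = X.shape
    beta = np.zeros(p)
    resid = np.asarray(y, dtype=float).copy()
    path = []
    for _ in range(n_steps):
        grad = X.T @ resid                # inner product of each column with the residual
        j = int(np.argmax(np.abs(grad)))  # coordinate with the largest absolute inner product
        beta[j] += nu * grad[j]           # shrunken least-squares update along coordinate j
        resid -= nu * grad[j] * X[:, j]   # update the residual accordingly
        path.append(j)
    return beta, path


def run_lengths(path):
    """Collapse a coordinate path into (coordinate, consecutive step count) pairs."""
    return [(j, sum(1 for _ in grp)) for j, grp in groupby(path)]
```

Running `run_lengths` on the path returned by `l2boost` with a small `nu` gives a rough empirical view of how many steps are taken along a fixed coordinate before the algorithm switches, and of whether the same few coordinates keep recurring, which is the behavior the abstract calls active set cycling.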
