The Rate of Convergence of AdaBoost
Indraneel Mukherjee, Cynthia Rudin, Robert E. Schapire

TL;DR
This paper analyzes how quickly AdaBoost approaches the minimum of the exponential loss. It proves polynomial bounds on the number of rounds required without any weak-learning assumption, and gives lower bounds showing that polynomial dependence on the parameters is necessary and that the 1/ε rate is optimal up to constant factors.
Contribution
It provides the first convergence bounds for AdaBoost that require neither a weak-learning assumption nor finite minimizers of the exponential loss, and it shows that the dependence of the rate on ε is tight up to constant factors.
Findings
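AdaBoost's exponential loss comes within ε of that of any parameter vector with ℓ1-norm at most B in a number of rounds polynomial in B and 1/ε.
Lower bounds show that polynomial dependence on these parameters is necessary.
AdaBoost comes within ε of the optimal exponential loss in O(1/ε) rounds, and this dependence on ε is optimal up to constant factors.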
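The findings above can be restated in symbols. The block below is a sketch in assumed notation, not the paper's exact theorem statements: the symbols m, N, h_j, λ_t, λ* and C are introduced here for illustration, and poly(B, 1/ε) stands for an unspecified polynomial.

```latex
% Assumed notation: m training examples (x_i, y_i) with y_i \in \{-1, +1\},
% weak hypotheses h_1, \dots, h_N with values in [-1, +1],
% and \lambda_t the parameter vector AdaBoost has computed after t rounds.

% Exponential loss of a parameter vector \lambda
L(\lambda) = \frac{1}{m} \sum_{i=1}^{m}
  \exp\Bigl( -y_i \sum_{j=1}^{N} \lambda_j h_j(x_i) \Bigr)

% First finding: comparison with any reference vector of bounded \ell_1-norm
\|\lambda^{*}\|_1 \le B
  \;\Longrightarrow\;
  L(\lambda_t) \le L(\lambda^{*}) + \varepsilon
  \quad \text{once } t \ge \mathrm{poly}(B, 1/\varepsilon)

% Third finding: comparison with the best possible loss,
% where C is a constant that depends on the dataset
L(\lambda_t) \le \inf_{\lambda} L(\lambda) + \varepsilon
  \quad \text{once } t \ge C / \varepsilon
```

The infimum in the last line need not be attained by any finite λ, which is why the first bound is phrased relative to a reference vector of bounded norm.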
Abstract
The AdaBoost algorithm was designed to combine many "weak" hypotheses that perform slightly better than random guessing into a "strong" hypothesis that has very low error. We study the rate at which AdaBoost iteratively converges to the minimum of the "exponential loss." Unlike previous work, our proofs do not require a weak-learning assumption, nor do they require that minimizers of the exponential loss are finite. Our first result shows that at iteration t, the exponential loss of AdaBoost's computed parameter vector will be at most ε more than that of any parameter vector of ℓ1-norm bounded by B in a number of rounds that is at most a polynomial in B and 1/ε. We also provide lower bounds showing that a polynomial dependence on these parameters is necessary. Our second result is that within C/ε iterations, AdaBoost achieves a value of the…
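To make the quantity being analyzed concrete, here is a minimal Python sketch of AdaBoost viewed as greedy coordinate descent on the exponential loss. It is an illustration under stated assumptions, not the authors' code: the function names `exp_loss` and `adaboost`, the matrix encoding `M[i, j] = y_i * h_j(x_i)`, and the three-example toy dataset are all hypothetical choices made here. The toy data is not separable, so no weak-learning assumption holds, and its infimum loss of 2/3 is approached only as the parameters grow without bound.

```python
import numpy as np


def exp_loss(M, lam):
    """Average exponential loss (1/m) * sum_i exp(-(M @ lam)_i)."""
    return float(np.mean(np.exp(-M @ lam)))


def adaboost(M, rounds):
    """Run AdaBoost as greedy coordinate descent on the exponential loss.

    M[i, j] = y_i * h_j(x_i), with entries in {-1, +1} (assumed encoding).
    Returns the final parameter vector and the loss after each round.
    """
    m, n = M.shape
    lam = np.zeros(n)
    losses = []
    for _ in range(rounds):
        w = np.exp(-M @ lam)          # unnormalized example weights
        d = w / w.sum()               # AdaBoost's distribution over examples
        edges = d @ M                 # edge of every weak hypothesis
        j = int(np.argmax(np.abs(edges)))
        r = edges[j]
        if abs(r) >= 1.0:             # a perfect hypothesis: step is unbounded
            break
        # Exact line minimizer along coordinate j for +/-1 valued hypotheses.
        lam[j] += 0.5 * np.log((1.0 + r) / (1.0 - r))
        losses.append(exp_loss(M, lam))
    return lam, losses


# Toy dataset (hypothetical): rows 0 and 1 are negatives of each other, so
# no lam gives every example a positive margin, and the loss stays above 2/3.
M = np.array([[1.0, -1.0],
              [-1.0, 1.0],
              [1.0, 1.0]])
lam, losses = adaboost(M, 200)
print("loss after round 1:  ", losses[0])
print("loss after round 200:", losses[-1])
print("gap to infimum 2/3:  ", losses[-1] - 2.0 / 3.0)
```

Printing the gap at several round counts gives a rough feel for how the loss approaches its infimum on this one example; it is not evidence for the general bounds, which the paper proves analytically.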
Taxonomy
Topics: Machine Learning and Algorithms · Stochastic Gradient Optimization Techniques · Domain Adaptation and Few-Shot Learning
