AdaBoost is not an Optimal Weak to Strong Learner
Mikael Møller Høgsgaard, Kasper Green Larsen, Martin Ritzert

TL;DR
This paper shows that AdaBoost, a widely used boosting algorithm, is not sample-optimal: it needs more training data than theoretically necessary to reach a given accuracy, unlike the recently developed optimal weak-to-strong learner of Larsen and Ritzert.
Contribution
The paper proves that the sample complexity of AdaBoost and other classic variations of it is sub-optimal by at least one logarithmic factor in the desired accuracy of the strong learner.
Findings
AdaBoost's sample complexity exceeds the optimum by at least one logarithmic factor in the desired accuracy of the strong learner.
Other classic boosting algorithms also exhibit sub-optimal sample complexity.
A provably optimal weak-to-strong learner (Larsen and Ritzert, NeurIPS'22) reaches the same accuracy with fewer training samples.
Abstract
AdaBoost is a classic boosting algorithm for combining multiple inaccurate classifiers produced by a weak learner to produce a strong learner with arbitrarily high accuracy when given enough training data. Determining the optimal number of samples necessary to obtain a given accuracy of the strong learner is a basic learning-theoretic question. Larsen and Ritzert (NeurIPS'22) recently presented the first provably optimal weak-to-strong learner. However, their algorithm is somewhat complicated, and it remains an intriguing question whether the prototypical boosting algorithm AdaBoost also makes optimal use of training samples. In this work, we answer this question in the negative. Concretely, we show that the sample complexity of AdaBoost, and of other classic variations thereof, is sub-optimal by at least one logarithmic factor in the desired accuracy of the strong learner.
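For readers unfamiliar with the algorithm the abstract refers to, the following is a minimal sketch of textbook AdaBoost, not the construction used in the paper's lower bound. It assumes labels in {-1, +1} and uses an exhaustive threshold-stump search as a stand-in weak learner; the function names (`best_stump`, `adaboost`, `predict`) and the toy data are illustrative choices and do not come from the paper.

```python
import math

def stump_predict(stump, x):
    """Predict +/-1 with a threshold stump (feature j, threshold thr, sign)."""
    _, j, thr, sign = stump
    return sign if x[j] >= thr else -sign

def best_stump(X, y, w):
    """Assumed weak learner: the stump with the lowest weighted training error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for sign in (1, -1):
                cand = (0.0, j, thr, sign)
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(cand, xi) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def adaboost(X, y, rounds):
    """Return a list of (alpha, stump) pairs forming a weighted-majority vote."""
    m = len(X)
    w = [1.0 / m] * m                      # start from the uniform distribution
    ensemble = []
    for _ in range(rounds):
        stump = best_stump(X, y, w)
        # Clamp the error away from 0 and 1 so the logarithm stays finite.
        err = min(max(stump[0], 1e-12), 1 - 1e-12)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Up-weight misclassified points, down-weight correct ones, renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote over all stumps."""
    score = sum(alpha * stump_predict(s, x) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Toy 1-D data that no single stump classifies perfectly.
X = [[0], [1], [2], [3], [4], [5]]
y = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(X, y, rounds=3)
predictions = [predict(ensemble, x) for x in X]
```

On this toy set a single stump must misclassify at least two points, while the three-round vote fits all six. The sketch only illustrates how weak hypotheses are combined; it says nothing about how many samples are needed for a given accuracy, which is the question the paper studies.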
Taxonomy
Topics: Machine Learning and Algorithms · Neural Networks and Applications · Machine Learning and Data Classification
