The Many Faces of Optimal Weak-to-Strong Learning

Mikael Møller Høgsgaard; Kasper Green Larsen; Markus Engelund Mathiasen

arXiv:2408.17148·cs.LG·September 2, 2024

TL;DR

This paper introduces a simple, sample-optimal boosting algorithm: partition the training data into 5 disjoint pieces of equal size, run AdaBoost on each piece, and combine the resulting classifiers by majority vote. A pilot empirical study suggests it might outperform previous sample-optimal algorithms on large data sets.

Contribution

The paper presents a new, simple boosting algorithm with provably optimal sample complexity and the fastest runtime among sample-optimal boosting algorithms, along with the first empirical comparison of these algorithms.

Findings

- May outperform previous sample-optimal algorithms on large data sets, according to a pilot empirical study
- Achieves provably optimal sample complexity
- Has the simplest description among sample-optimal boosting algorithms

Abstract

Boosting is an extremely successful idea, allowing one to combine multiple low accuracy classifiers into a much more accurate voting classifier. In this work, we present a new and surprisingly simple Boosting algorithm that obtains a provably optimal sample complexity. Sample optimal Boosting algorithms have only recently been developed, and our new algorithm has the fastest runtime among all such algorithms and is the simplest to describe: Partition your training data into 5 disjoint pieces of equal size, run AdaBoost on each, and combine the resulting classifiers via a majority vote. In addition to this theoretical contribution, we also perform the first empirical comparison of the proposed sample optimal Boosting algorithms. Our pilot empirical study suggests that our new algorithm might outperform previous algorithms on large data sets.
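The three steps named in the abstract (partition into 5 disjoint pieces, run AdaBoost on each, take a majority vote) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the decision-stump weak learner, the shuffle before partitioning, the default of 20 boosting rounds, and all function names (`fit_stump`, `adaboost`, `majority_of_adaboosts`) are assumptions made for the example. Labels are assumed to be +1 or -1.

```python
import math
import random


def fit_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            for s in (1, -1):
                # The stump predicts s when x[j] >= t and -s otherwise.
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if (s if xi[j] >= t else -s) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, s)
    return best


def stump_predict(stump, x):
    _, j, t, s = stump
    return s if x[j] >= t else -s


def adaboost(X, y, rounds=20):
    """Standard AdaBoost over decision stumps; returns [(alpha, stump), ...]."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        # Clamp the error so alpha stays finite when a stump is perfect.
        err = min(max(stump[0], 1e-12), 1 - 1e-12)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Up-weight misclassified points, down-weight correct ones.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def ensemble_predict(ensemble, x):
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


def majority_of_adaboosts(X, y, pieces=5, rounds=20, seed=0):
    """Split the data into disjoint equal-size pieces, boost each, then vote."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)  # assumption: guards against sorted input
    size = len(idx) // pieces         # leftover samples are dropped
    voters = []
    for k in range(pieces):
        part = idx[k * size:(k + 1) * size]
        voters.append(adaboost([X[i] for i in part],
                               [y[i] for i in part], rounds))

    def predict(x):
        # With an odd number of pieces the vote can never tie.
        votes = sum(ensemble_predict(v, x) for v in voters)
        return 1 if votes > 0 else -1

    return predict
```

A call such as `predict = majority_of_adaboosts(X, y)` returns a function that classifies a single feature vector, for example `predict([0.3])` for one-dimensional data. Each AdaBoost run sees only one fifth of the data, so the five runs are independent of one another and could be executed in parallel.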

Peer Reviews

Decision·NeurIPS 2024 poster

Reviewer 01 · Rating 7 · Confidence 5

Strengths

- Very strong and interesting result
- Mathematically sound, in my judgement
- Good presentation, self-contained and well-structured

Weaknesses

N/A

Videos

The Many Faces of Optimal Weak-to-Strong Learning · SlidesLive

Taxonomy

Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Imbalanced Data Classification Techniques