A Theory of Probabilistic Boosting, Decision Trees and Matryoshki

Etienne Grossmann

arXiv:cs/0607110·cs.LG·May 23, 2007

TL;DR

This paper develops a theoretical framework for boosting probabilistic classifiers, introduces a nested tree algorithm called matryoshka, and gives training error bounds which suggest that the matryoshka uses probabilistic weak classifiers more efficiently than simple decision trees.

Contribution

It introduces the matryoshka algorithm, a nested tree boosting method, and provides a theoretical analysis, with training error bounds, suggesting that it leverages probabilistic weak classifiers more efficiently than simple decision trees.

Findings

01

The analysis suggests that the matryoshka leverages probabilistic weak classifiers more efficiently than simple decision trees.

02

Training error bounds are given for each of the algorithms analyzed.

03

Three boosting algorithms are compared for probabilistic classification: probabilistic Adaboost, decision tree, and matryoshka.

Abstract

We present a theory of boosting probabilistic classifiers. We place ourselves in the situation of a user who only provides a stopping parameter and a probabilistic weak learner/classifier and compare three types of boosting algorithms: probabilistic Adaboost, decision tree, and tree of trees of ... of trees, which we call matryoshka. "Nested tree," "embedded tree" and "recursive tree" are also appropriate names for this algorithm, which is one of our contributions. Our other contribution is the theoretical analysis of the algorithms, in which we give training error bounds. This analysis suggests that the matryoshka leverages probabilistic weak classifiers more efficiently than simple decision trees.
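The "tree of trees" nesting described in the abstract can be pictured with a minimal structural sketch. It is not the paper's algorithm: the names `Node`, `predict` and `as_classifier`, the hand-written stumps, and the rule of routing on whether the estimated probability is at least 0.5 are all assumptions made for illustration, and no training procedure is shown.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# A probabilistic classifier maps an input to an estimate of P(y = 1 | x).
Classifier = Callable[[Sequence[float]], float]


@dataclass
class Node:
    """One node of a tree whose split is itself a probabilistic classifier."""
    split: Classifier
    low: Optional["Node"] = None    # followed when split(x) < 0.5 (assumed rule)
    high: Optional["Node"] = None   # followed when split(x) >= 0.5 (assumed rule)


def predict(node: Node, x: Sequence[float]) -> float:
    """Walk down the tree; return the estimate of the last node reached."""
    p = node.split(x)
    child = node.high if p >= 0.5 else node.low
    return p if child is None else predict(child, x)


def as_classifier(root: Node) -> Classifier:
    """Wrap a whole tree so it can serve as the split of a node in another tree."""
    return lambda x: predict(root, x)


# Hypothetical weak classifiers: threshold stumps with fixed probability outputs.
def stump0(x: Sequence[float]) -> float:
    return 0.9 if x[0] > 0 else 0.2


def stump1(x: Sequence[float]) -> float:
    return 0.8 if x[1] > 0 else 0.3


# Inner tree built from weak classifiers.
inner = Node(split=stump0, high=Node(split=stump1))

# Outer tree whose root split is the inner tree: a tree of trees.
outer = Node(split=as_classifier(inner), low=Node(split=stump1))

for point in ([1, 1], [-1, 1], [-1, -1], [1, -1]):
    print(point, predict(outer, point))
```

Because `as_classifier` returns something of the same type as a weak classifier, the wrapping can be repeated to any depth, which is the sense in which the structure resembles nested dolls.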

Taxonomy

Topics: Machine Learning and Data Classification · Bayesian Modeling and Causal Inference · Imbalanced Data Classification Techniques