Contrast Trees and Distribution Boosting

Jerome H. Friedman

arXiv:1912.03785·stat.ML·May 25, 2022

TL;DR

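This paper introduces contrast trees, a method for assessing the accuracy of machine learning estimates that are not amenable to standard (cross) validation. It shows that boosted contrast trees can often improve performance where inaccuracies are detected, and presents distribution boosting, an assumption-free method for estimating the full probability distribution of an outcome variable given its predictor values.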

Contribution

It proposes contrast trees for accuracy assessment and introduces distribution boosting as an assumption-free distribution estimation technique.

Findings

01

Contrast trees help evaluate the validity of machine learning results.

02

Boosted contrast trees can enhance prediction performance.

03

Distribution boosting estimates full probability distributions without assumptions.

Abstract

Often machine learning methods are applied and results reported in cases where there is little to no information concerning the accuracy of the output. Simply because a computer program returns a result does not ensure its validity. If decisions are to be made based on such results, it is important to have some notion of their veracity. Contrast trees represent a new approach for assessing the accuracy of many types of machine learning estimates that are not amenable to standard (cross) validation methods. In situations where inaccuracies are detected, boosted contrast trees can often improve performance. A special case, distribution boosting, provides an assumption-free method for estimating the full probability distribution of an outcome variable given any set of joint input predictor variable values.
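To make the idea of accuracy assessment concrete, the following is a toy sketch, not the paper's algorithm. It assumes a single numeric predictor `x`, observed outcomes `y`, and model estimates `z`, and searches for the one threshold on `x` that isolates the region where the estimates disagree most with the outcomes. The function name `best_contrast_split`, the discrepancy measure (absolute difference of means), and the `min_leaf` parameter are all illustrative assumptions; the abstract does not specify the split criterion or discrepancy measures.

```python
import numpy as np

def best_contrast_split(x, y, z, min_leaf=20):
    """Toy single-split search (illustrative assumption, not the paper's method).

    Returns (discrepancy, threshold, side) for the child region of a split on
    the 1-D feature x with the largest |mean(y) - mean(z)|.
    """
    order = np.argsort(x)
    x, y, z = x[order], y[order], z[order]
    diff = y - z
    n = len(x)
    csum = np.cumsum(diff)          # running sum of (y - z) in x order
    total = csum[-1]
    best = (0.0, None, None)
    # i = number of points sent to the left child
    for i in range(min_leaf, n - min_leaf + 1):
        if x[i - 1] == x[i]:
            continue                # cannot split between tied values
        left = abs(csum[i - 1] / i)
        right = abs((total - csum[i - 1]) / (n - i))
        thr = 0.5 * (x[i - 1] + x[i])
        if left > best[0]:
            best = (left, thr, "left")
        if right > best[0]:
            best = (right, thr, "right")
    return best

# Synthetic check: the estimates z are accurate for x <= 0.7 and biased
# upward by 0.5 for x > 0.7.
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 2000)
signal = np.sin(2 * np.pi * x)
y = signal + rng.normal(0, 0.1, 2000)
z = signal + np.where(x > 0.7, 0.5, 0.0)

disc, thr, side = best_contrast_split(x, y, z)
print(f"largest discrepancy {disc:.3f} in the {side} region of x = {thr:.3f}")
```

A global error summary over this data would average the biased and unbiased regions together, whereas the split search points at the part of the input space where the estimates fail. Note that this naive criterion ignores region size, so it tends to favor small, noisy regions; a practical method would need to account for that.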
