Efficient algorithms for decision tree cross-validation

Hendrik Blockeel; Jan Struyf

arXiv:cs/0110036·cs.LG·May 23, 2007·130 cites

TL;DR

This paper presents a method to significantly reduce the computational overhead of cross-validation in decision tree algorithms by integrating it with the tree induction process, supported by theoretical analysis and experiments.

Contribution

It introduces an integrated approach to decision tree cross-validation that decreases computational costs compared to traditional methods.

Findings

01 Significant speedups in cross-validation for decision trees.

02 Adaptations of existing algorithms enable more efficient validation.

03 Experimental results confirm theoretical speedup estimates.

Abstract

Cross-validation is a useful and generally applicable technique often employed in machine learning, including decision tree induction. An important disadvantage of straightforward implementation of the technique is its computational overhead. In this paper we show that, for decision trees, the computational overhead of cross-validation can be reduced significantly by integrating the cross-validation with the normal decision tree induction process. We discuss how existing decision tree algorithms can be adapted to this aim, and provide an analysis of the speedups these adaptations may yield. The analysis is supported by experimental results.

Taxonomy

Topics: Data Mining Algorithms and Applications · Machine Learning and Data Classification · Imbalanced Data Classification Techniques