Optimally Pruning Decision Tree Ensembles With Feature Cost
Feng Nan, Joseph Wang, Venkatesh Saligrama

TL;DR
This paper introduces a method for pruning decision tree ensembles to reduce feature cost while maintaining high accuracy. Pruning is formulated as a 0-1 integer program whose linear programming relaxation is proven to give the exact solution.
Contribution
It presents a general integer programming formulation for ensemble pruning that explicitly considers feature sharing and cost-accuracy trade-offs, enabling optimal pruning under budget constraints.
Findings
Significantly reduces feature costs in ensemble models.
The LP relaxation provably produces the exact solution of the original integer program, so efficient convex optimization tools can be used.
Improves performance over the state-of-the-art BudgetRF method.
Abstract
We consider the problem of learning decision rules for prediction with feature budget constraint. In particular, we are interested in pruning an ensemble of decision trees to reduce expected feature cost while maintaining high prediction accuracy for any test example. We propose a novel 0-1 integer program formulation for ensemble pruning. Our pruning formulation is general - it takes any ensemble of decision trees as input. By explicitly accounting for feature-sharing across trees together with accuracy/cost trade-off, our method is able to significantly reduce feature cost by pruning subtrees that introduce more loss in terms of feature cost than benefit in terms of prediction accuracy gain. Theoretically, we prove that a linear programming relaxation produces the exact solution of the original integer program. This allows us to use efficient convex optimization tools to obtain an…
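The trade-off described in the abstract (keep a split only when its accuracy gain outweighs the feature cost it adds) can be illustrated with a small sketch. This is not the authors' integer program or its LP relaxation: it is a simplified single-tree dynamic program, assuming an objective of training errors plus `lam` times acquisition cost, where a feature already used higher on the path is free for the examples that reach a node. The names `Node`, `prune_cost`, `lam`, and the toy numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional, Tuple


@dataclass
class Node:
    n: int                          # training examples reaching this node
    leaf_errors: int                # training errors if this node becomes a leaf
    feature: Optional[str] = None   # feature tested here (None for a leaf)
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def prune_cost(node: Node, costs: Dict[str, float], lam: float,
               acquired: FrozenSet[str] = frozenset()) -> Tuple[float, Node]:
    """Return (objective, pruned subtree) minimizing errors + lam * feature cost.

    Assumed objective (illustrative only): each example pays for a feature
    the first time it meets that feature on its root-to-leaf path.
    """
    as_leaf = (float(node.leaf_errors), Node(node.n, node.leaf_errors))
    if node.feature is None:
        return as_leaf
    # A feature already acquired by an ancestor costs nothing extra here.
    pay = 0.0 if node.feature in acquired else costs[node.feature]
    seen = acquired | {node.feature}
    l_obj, l_tree = prune_cost(node.left, costs, lam, seen)
    r_obj, r_tree = prune_cost(node.right, costs, lam, seen)
    keep_obj = lam * pay * node.n + l_obj + r_obj
    if keep_obj < as_leaf[0]:
        kept = Node(node.n, node.leaf_errors, node.feature, l_tree, r_tree)
        return keep_obj, kept
    return as_leaf


# Toy tree: the root tests cheap feature "a"; the left child tests expensive
# feature "b" for a small gain; the right child reuses "a" (no extra cost).
root = Node(100, 40, "a",
            left=Node(60, 10, "b", Node(30, 4), Node(30, 5)),
            right=Node(40, 15, "a", Node(20, 3), Node(20, 4)))
costs = {"a": 1.0, "b": 5.0}

objective, pruned = prune_cost(root, costs, lam=0.1)
print(objective, pruned.left.feature, pruned.right.feature)
```

In this toy setting the split on `b` is pruned because its one-error improvement does not cover the cost of acquiring `b` for 60 examples, while the second split on `a` is kept because the feature was already paid for at the root. Handling feature sharing across several trees at once, which the abstract highlights, is what requires the joint integer program and is not captured by this per-tree recursion.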
Taxonomy
Topics: Machine Learning and Data Classification · Adversarial Robustness in Machine Learning · Machine Learning and Algorithms
