Ranking Perspective for Tree-based Methods with Applications to Symbolic Feature Selection
Hengrui Luo, Meng Li

TL;DR
This paper offers a finite-sample analysis of tree-based methods from a ranking perspective, revealing their ability to distinguish symbolic features and providing new theoretical insights into their performance in regression and feature selection.
Contribution
The paper introduces a novel ranking perspective for analyzing tree-based methods, extends the theoretical understanding of these methods, and proposes new divergence statistics for symbolic feature evaluation.
Findings
Finite-sample oracle bounds for trees and ensembles
Ranking consistency and posterior contraction results
Effective divergence statistics for symbolic feature selection
Abstract
Tree-based methods are powerful nonparametric techniques in statistics and machine learning. However, their effectiveness, particularly in finite-sample settings, is not fully understood. Recent applications have revealed their surprising ability to distinguish transformations (which we call symbolic feature selection) that remain obscure under current theoretical understanding. This work provides a finite-sample analysis of tree-based methods from a ranking perspective. We link oracle partitions in tree methods to response rankings at local splits, offering new insights into their finite-sample behavior in regression and feature selection tasks. Building on this local ranking perspective, we extend our analysis in two ways: (i) We examine the global ranking performance of individual trees and ensembles, including Classification and Regression Trees (CART) and Bayesian Additive…
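As a rough illustration of why a single split can tell symbolic transformations apart, the sketch below scores candidate transformations of one input by the largest sum-of-squared-errors reduction a single threshold split achieves. This is not the paper's procedure or its divergence statistics: the function name `best_split_sse_reduction`, the simulated data, and the candidate set are all assumptions made for this example. A split depends only on the ordering of the feature values, so strictly monotone transformations of the same input score identically, while a transformation whose ordering matches the ordering of the response scores higher.

```python
import numpy as np


def best_split_sse_reduction(feature, y):
    """Largest SSE reduction from one threshold split on `feature`.

    Hypothetical helper for illustration only; it mimics the variance-reduction
    criterion of a regression tree at a single node.
    """
    order = np.argsort(feature, kind="stable")
    f_sorted = feature[order]
    y_sorted = y[order]
    n = y_sorted.size
    total_sse = np.sum((y_sorted - y_sorted.mean()) ** 2)
    csum = np.cumsum(y_sorted)
    csum_sq = np.cumsum(y_sorted ** 2)
    best = 0.0
    for i in range(1, n):
        # A threshold can only fall between two distinct feature values.
        if f_sorted[i] == f_sorted[i - 1]:
            continue
        left_sse = csum_sq[i - 1] - csum[i - 1] ** 2 / i
        right_sum = csum[-1] - csum[i - 1]
        right_sq = csum_sq[-1] - csum_sq[i - 1]
        right_sse = right_sq - right_sum ** 2 / (n - i)
        best = max(best, total_sse - left_sse - right_sse)
    return best


# Simulated data (an assumption of this sketch): the response depends on x^2.
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=500)
y = x ** 2 + rng.normal(scale=0.1, size=x.size)

# Candidate symbolic features built from the same input.
candidates = {"x": x, "x^2": x ** 2, "exp(x)": np.exp(x)}
scores = {name: best_split_sse_reduction(f, y) for name, f in candidates.items()}
best_name = max(scores, key=scores.get)
print(scores, best_name)
```

In this toy setting the candidates `x` and `exp(x)` induce the same ordering of the samples and therefore the same score, which is why the comparison is informative only across transformations that rank the samples differently.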
Taxonomy
Topics: Data Mining Algorithms and Applications · Data Management and Algorithms · Rough Sets and Fuzzy Logic
Methods: Feature Selection
