Building Consistent Regression Trees From Complex Sample Data
Daniell Toth, John Eltinge

TL;DR
This paper introduces a method for building regression trees from data collected under a complex sample design, and gives sufficient conditions under which the resulting trees are asymptotically design L2 consistent estimators of the regression function.
Contribution
It proposes a novel approach to incorporate complex sample design information into recursive partitioning algorithms for regression trees.
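As a rough illustration of what using design information in a recursive partitioning step can look like, the sketch below chooses a single split by minimizing a weighted sum of squared errors. It assumes each sampled unit carries a design weight (for example, the inverse of its inclusion probability); the function names and the weighted-SSE criterion are illustrative assumptions, not the authors' algorithm, which this summary does not spell out.

```python
from typing import List, Optional, Tuple


def weighted_mean(y: List[float], w: List[float]) -> float:
    """Design-weighted mean of y (weights assumed positive)."""
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)


def weighted_sse(y: List[float], w: List[float]) -> float:
    """Design-weighted sum of squared deviations from the weighted mean."""
    m = weighted_mean(y, w)
    return sum(wi * (yi - m) ** 2 for wi, yi in zip(w, y))


def best_weighted_split(
    x: List[float], y: List[float], w: List[float], min_leaf: int = 1
) -> Optional[Tuple[float, float]]:
    """Return (threshold, total weighted SSE) for the best split x <= threshold.

    Hypothetical helper: one splitting step on one covariate. Returns None
    when no split leaves at least min_leaf sampled units on each side.
    """
    min_leaf = max(1, min_leaf)
    order = sorted(range(len(x)), key=lambda i: x[i])
    xs = [x[i] for i in order]
    ys = [y[i] for i in order]
    ws = [w[i] for i in order]

    best: Optional[Tuple[float, float]] = None
    for k in range(min_leaf, len(xs) - min_leaf + 1):
        if xs[k - 1] == xs[k]:
            continue  # cannot split between tied covariate values
        sse = weighted_sse(ys[:k], ws[:k]) + weighted_sse(ys[k:], ws[k:])
        threshold = (xs[k - 1] + xs[k]) / 2
        if best is None or sse < best[1]:
            best = (threshold, sse)
    return best


# Toy data: the weights are made up purely for illustration.
split = best_weighted_split([1, 2, 3, 4], [1, 1, 5, 5], [10, 10, 1, 1])
```

With equal weights this reduces to the usual least-squares split. With unequal weights, units that represent more of the population pull the node means and the chosen threshold toward themselves, which is the basic reason design information can change the fitted tree.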
Findings
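Establishes sufficient conditions for asymptotic design L2 consistency of the trees as estimators of an arbitrary regression function.
Illustrates the method with Occupational Employment Statistics establishment survey data linked to Quarterly Census of Employment and Wage payroll data.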
Shows improved estimator performance through simulation.
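For readers unfamiliar with the term, one common way to formalize L2 consistency of an estimator is sketched below. The notation is an assumption made for illustration: $h$ is the target regression function, $\tilde{h}_N$ is the tree estimator built from a sample drawn from a population of size $N$, and the expectation is taken with respect to the sample design (and, depending on the framework, a superpopulation model). The paper's exact statement and conditions may differ.

```latex
h(x) = \mathrm{E}\left[\, Y \mid X = x \,\right],
\qquad
\lim_{N \to \infty} \mathrm{E}\!\left[ \left( \tilde{h}_N(x) - h(x) \right)^{2} \right] = 0 .
```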
Abstract
In the past several years a wide range of methods for the construction of regression trees and other estimators based on the recursive partitioning of samples have appeared in the statistics literature. Many applications involve data collected through a complex sample design. At present, however, relatively little is known regarding the properties of these methods under complex designs. This article proposes a method for incorporating information about the complex sample design when building a regression tree using a recursive partitioning algorithm. Sufficient conditions are established for asymptotic design L2 consistency of these regression trees as estimators for an arbitrary regression function. The proposed method is illustrated with Occupational Employment Statistics establishment survey data linked to Quarterly Census of Employment and Wage payroll data of the Bureau of Labor…
Taxonomy
Topics: Data Mining Algorithms and Applications · Bayesian Methods and Mixture Models · Statistical Methods and Inference
