Policy-Oriented Binary Classification: Improving (KD-)CART Final Splits for Subpopulation Targeting
Lei Bill Wang, Zhenbang Jiao, Fangyi Wang

TL;DR
This paper introduces new final-split rules for binary classification in policy targeting. The authors prove that one of them, MDFS, is consistent and strictly dominates CART and KD-CART under the unique intersect assumption, and they report that it targets more vulnerable subpopulations.
Contribution
The paper proposes three final-split rules for Latent Probability Classification (LPC) problems: MDFS, which the authors prove strictly dominates CART and KD-CART under the unique intersect assumption, and PFS and wEFS, which relax that assumption.
Findings
MDFS strictly dominates CART/KD-CART under the unique intersect assumption.
Proposed methods outperform CART/KD-CART in simulations.
On real data, MDFS targets more vulnerable subpopulations than CART/KD-CART.
Abstract
Policymakers often use recursive binary split rules to partition populations based on binary outcomes and target subpopulations whose probability of the binary event exceeds a threshold. We call such problems Latent Probability Classification (LPC). Practitioners typically employ Classification and Regression Trees (CART) for LPC. We prove that in the context of LPC, classic CART and the knowledge distillation method, whose student model is a CART (referred to as KD-CART), are suboptimal. We propose Maximizing Distance Final Split (MDFS), which generates split rules that strictly dominate CART/KD-CART under the unique intersect assumption. MDFS identifies the unique best split rule, is consistent, and targets more vulnerable subpopulations than CART/KD-CART. To relax the unique intersect assumption, we additionally propose Penalized Final Split (PFS) and weighted Empirical risk Final…
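The LPC setup described in the abstract can be illustrated with a minimal sketch of the CART baseline. This is not the paper's MDFS, PFS, or wEFS method. It assumes a single feature, a single final split chosen by Gini impurity, and synthetic data in which the event probability equals the feature value; the function names, the threshold c = 0.7, and the data are all hypothetical choices made for illustration.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def cart_final_split(xs, ys):
    """Cut point on one feature that minimizes weighted Gini impurity."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    best_cut, best_score = None, float("inf")
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut possible between identical feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_cut = (pairs[i - 1][0] + pairs[i][0]) / 2.0
            best_score = score
    return best_cut

def target_nodes(xs, ys, cut, c):
    """Flag each child node whose empirical event rate exceeds threshold c."""
    left = [y for x, y in zip(xs, ys) if x <= cut]
    right = [y for x, y in zip(xs, ys) if x > cut]
    rates = {"left": sum(left) / len(left), "right": sum(right) / len(right)}
    targeted = {side: rate > c for side, rate in rates.items()}
    return targeted, rates

# Synthetic data (an assumption): P(Y = 1 | X = x) = x on [0, 1].
random.seed(0)
xs = [random.random() for _ in range(2000)]
ys = [1 if random.random() < x else 0 for x in xs]

cut = cart_final_split(xs, ys)
targeted, rates = target_nodes(xs, ys, cut, c=0.7)
print("cut:", cut, "rates:", rates, "targeted:", targeted)
```

In this toy setup the subpopulation that would ideally be targeted is x > c, yet the impurity-minimizing cut is chosen without any reference to c. That mismatch between the split criterion and the policy threshold is the kind of gap the abstract's final-split rules are aimed at.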
Taxonomy
Topics: Machine Learning in Healthcare · Health, Environment, Cognitive Aging
Methods: Knowledge Distillation
