Non-splitting Neyman-Pearson Classifiers
Jingming Wang, Lucy Xia, Zhigang Bao, Xin Tong

TL;DR
This paper introduces a novel non-splitting Neyman-Pearson classifier that improves data utilization and reduces type II error by avoiding sample splitting, based on a linear discriminant analysis model and a new CLT for quadratic forms.
Contribution
It develops the first Neyman-Pearson classifier without sample splitting, leveraging a CLT for quadratic forms in a linear discriminant analysis framework.
Findings
Numerical experiments show improved performance over split-sample methods.
The new classifier maintains type I error control with higher power.
Theoretical results provide a foundation for non-splitting NP classifiers.
Abstract
The Neyman-Pearson (NP) binary classification paradigm constrains the more severe type of error (e.g., the type I error) under a preferred level while minimizing the other (e.g., the type II error). This paradigm is suitable for applications such as severe disease diagnosis and fraud detection. A series of NP classifiers have been developed to guarantee the type I error control with high probability. However, these existing classifiers involve a sample splitting step: a mixture of class 0 and class 1 observations is used to construct a scoring function, and some left-out class 0 observations are used to construct a threshold. This splitting enables classifier construction built upon independence, but it amounts to insufficient use of data for training and a potentially higher type II error. Leveraging a canonical linear discriminant analysis model, we derive a quantitative CLT for a certain…
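For illustration, the threshold step of the sample-splitting approach described above can be sketched as follows. This is a minimal sketch of the standard order-statistic rule used by split-sample NP classifiers, not the non-splitting method this paper proposes. It assumes the scoring function has already been fit on separate data, that an observation is assigned to class 1 when its score exceeds the threshold, and that the left-out class 0 scores are independent and identically distributed. The function names, `alpha` (target type I error level), and `delta` (tolerated probability of violating that level) are illustrative choices, not names taken from the paper.

```python
from math import comb


def violation_prob(k, n, alpha):
    """Bound on P(type I error > alpha) when the threshold is the k-th
    smallest of n left-out class 0 scores.

    This equals P(Binomial(n, 1 - alpha) >= k); it is exact for continuous
    scores and an upper bound otherwise.
    """
    return sum(
        comb(n, j) * (1 - alpha) ** j * alpha ** (n - j)
        for j in range(k, n + 1)
    )


def np_threshold_rank(n, alpha, delta):
    """Smallest rank k whose violation probability is at most delta.

    Returns None when even the largest score (k = n) cannot meet the
    requirement, i.e. there are too few left-out class 0 observations.
    """
    for k in range(1, n + 1):
        if violation_prob(k, n, alpha) <= delta:
            return k
    return None


def np_threshold(class0_scores, alpha, delta):
    """Threshold chosen from left-out class 0 scores (hypothetical helper)."""
    k = np_threshold_rank(len(class0_scores), alpha, delta)
    if k is None:
        raise ValueError("not enough left-out class 0 observations")
    return sorted(class0_scores)[k - 1]
```

Under these assumptions, the rule needs at least 59 left-out class 0 observations when `alpha = delta = 0.05`, since 0.95 ** 58 is still above 0.05. Observations reserved for the threshold cannot be used to fit the scoring function, which is the data-utilization cost the abstract refers to.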
Taxonomy
Topics: Advanced Statistical Methods and Models · Face and Expression Recognition · Statistical Methods and Inference
