Non-Data-Splitting Estimator Selection for Regression in Exponential Families
Juntong Chen

TL;DR
This paper introduces a data-efficient estimator selection method for regression in exponential families that avoids data splitting, providing theoretical guarantees and demonstrating practical effectiveness in changepoint detection and simulations.
Contribution
It proposes a novel non-data-splitting estimator selection procedure with theoretical risk bounds for exponential family regression models.
Findings
The method achieves competitive risk bounds without data splitting.
It effectively detects changepoints in exponential family models.
Simulations and real-data experiments indicate practical advantages over traditional methods.
Abstract
We observe independent pairs of random variables, each consisting of a covariate and a response, where the conditional distribution of the response given the covariate follows a one-parameter exponential family whose parameter depends on the covariate. Our goal is to estimate the regression function, that is, the map from covariate values to this parameter. We start with an arbitrary collection of piecewise constant candidate estimators based on our observations and, using the same data, select an estimator from this collection. Our approach is agnostic to how the candidate estimators depend on the data, unlike data splitting, cross-validation, and hold-out methods. To demonstrate its theoretical performance, we provide a non-asymptotic risk bound for the selected estimator. We then explain how to apply the procedure to changepoint detection in exponential families. The practical performance of the proposed approach is illustrated through a…
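The setting in the abstract can be illustrated with a minimal sketch under assumptions that are not taken from the paper: Poisson responses, a fixed equispaced design, candidates built as bin averages on equal-width bins, and a BIC-style penalty used purely as a stand-in selection rule. The paper's own selection criterion is not reproduced here, and all names (`fit_piecewise_constant`, `poisson_nll`, `bin_counts`) are hypothetical.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson(rate) variate with Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def fit_piecewise_constant(y, n_bins):
    """Candidate estimator: the bin average (Poisson MLE) on n_bins equal-width bins."""
    n = len(y)
    bin_of = [i * n_bins // n for i in range(n)]
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for b, yi in zip(bin_of, y):
        sums[b] += yi
        counts[b] += 1
    means = [s / c for s, c in zip(sums, counts)]
    return [means[b] for b in bin_of]


def poisson_nll(y, fitted):
    """Negative Poisson log-likelihood, omitting the log(y!) term (constant in the fit)."""
    total = 0.0
    for yi, mu in zip(y, fitted):
        # If yi > 0 then the bin mean mu is > 0, so log(mu) is defined.
        total += mu - (yi * math.log(mu) if yi > 0 else 0.0)
    return total


rng = random.Random(0)
n = 64
# Assumed ground truth: a single changepoint in the Poisson mean at index 40.
true_mean = [2.0 if i < 40 else 8.0 for i in range(n)]
y = [sample_poisson(m, rng) for m in true_mean]

# Candidates are fitted on the SAME data that is later used for selection.
bin_counts = [1, 2, 4, 8, 16]
candidates = {k: fit_piecewise_constant(y, k) for k in bin_counts}
nll = {k: poisson_nll(y, fit) for k, fit in candidates.items()}

# Stand-in rule (BIC-style penalty), NOT the criterion proposed in the paper.
penalized = {k: nll[k] + 0.5 * k * math.log(n) for k in bin_counts}
selected = min(penalized, key=penalized.get)

for k in bin_counts:
    print(k, round(nll[k], 3), round(penalized[k], 3))
print("selected number of bins:", selected)
```

Because these partitions are nested, the unpenalized same-data likelihood can only improve as the partition is refined, so it would always favor the finest candidate. This is why reusing the data for selection calls for a dedicated criterion rather than a raw goodness-of-fit comparison.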
Taxonomy
Topics: Statistical Methods and Inference · Advanced Causal Inference Techniques · Gene expression and cancer classification
