Non-Data-Splitting Estimator Selection for Regression in Exponential Families

Juntong Chen

arXiv:2212.12954·stat.ME·February 12, 2025

TL;DR

This paper introduces a data-efficient estimator selection method for regression in exponential families that avoids data splitting, providing theoretical guarantees and demonstrating practical effectiveness in changepoint detection and simulations.

Contribution

It proposes a novel non-data-splitting estimator selection procedure with theoretical risk bounds for exponential family regression models.

Findings

01 The method achieves competitive risk bounds without data splitting.

02 It effectively detects changepoints in exponential family models.

03 Simulations and real-data experiments show practical advantages over traditional methods.

Abstract

We observe $n$ independent pairs of random variables $(W_{i}, Y_{i})$, where the conditional distribution of $Y_{i}$ given $W_{i} = w_{i}$ follows a one-parameter exponential family with parameter $\boldsymbol{\gamma}^{*}(w_{i}) \in \mathbb{R}$. Our goal is to estimate the regression function $\boldsymbol{\gamma}^{*}$. We start with an arbitrary collection of piecewise constant candidate estimators based on our observations and, using the same data, select an estimator from this collection. Our approach is agnostic to the dependencies of the candidate estimators on the data, differing from methods like data splitting, cross-validation, and hold-out. To demonstrate its theoretical performance, we provide a non-asymptotic risk bound for the selected estimator. We then explain how to apply the procedure to changepoint detection in exponential families. The practical performance of the proposed approach is illustrated through a…
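To make the setting in the abstract concrete, here is a minimal Python sketch of the kind of input such a procedure might receive. It is an illustration under stated assumptions, not the paper's method: the Poisson family, the equal-width partitions, and every name (`draw_poisson`, `simulate`, `fit_piecewise_constant`, `candidates`) are hypothetical choices made for this example. The sketch simulates data with a single changepoint in the natural parameter and builds a small collection of piecewise constant candidates from the same data. The selection step itself is not specified in the abstract and is deliberately left out.

```python
import math
import random


def draw_poisson(rng, mean):
    """Draw one Poisson variate by Knuth's multiplication method (adequate for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate(n, natural_param, seed=0):
    """Simulate (w_i, y_i) with y_i ~ Poisson(exp(natural_param(w_i))) on a fixed design in [0, 1)."""
    rng = random.Random(seed)
    ws = [(i + 0.5) / n for i in range(n)]
    ys = [draw_poisson(rng, math.exp(natural_param(w))) for w in ws]
    return ws, ys


def fit_piecewise_constant(ws, ys, n_bins, floor=1e-3):
    """Piecewise constant estimate of the natural parameter on n_bins equal-width bins.

    On each bin the value is log(mean of y), the Poisson maximum likelihood
    estimate of the natural parameter; `floor` (an assumption of this sketch)
    guards against log(0) on bins with zero mean or no observations.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for w, y in zip(ws, ys):
        j = min(int(w * n_bins), n_bins - 1)
        sums[j] += y
        counts[j] += 1
    values = []
    for s, c in zip(sums, counts):
        mean = s / c if c else floor
        values.append(math.log(max(mean, floor)))
    return values


# Hypothetical ground truth: one changepoint at w = 0.5 in the natural parameter.
def true_param(w):
    return 0.0 if w < 0.5 else 1.5


ws, ys = simulate(200, true_param)

# Candidate collection: every estimator is built from the full sample.
candidates = {m: fit_piecewise_constant(ws, ys, m) for m in (1, 2, 4, 8)}
```

Each entry of `candidates` depends on all of the observations, which is the situation the abstract describes: the selection rule has to cope with candidates that were fitted on the very data used to compare them.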

Taxonomy

Topics: Statistical Methods and Inference · Advanced Causal Inference Techniques · Gene expression and cancer classification