Nonparametric Regression for Random Unbiased Perturbations
Anna Lyubarskaja, Dominik Rothenhäusler

TL;DR
This paper investigates nonparametric regression under random unbiased perturbations of the conditional distribution, revealing how such perturbations inflate variance, alter bandwidth selection, and impact statistical efficiency and limits.
Contribution
It introduces the concept of RUPs, derives an extended bias-variance decomposition including distributional variance, and establishes minimax bounds showing the fundamental impact of dataset-level perturbations.
Findings
Distributional uncertainty reduces effective sample size to n/(1 + nτ)
Optimal bandwidth scales as τ^{1/(2β+1)} under dominant distributional uncertainty
Minimax bounds demonstrate the fundamental limits imposed by RUPs
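The first two findings can be made concrete with a short numerical sketch. The function names below are hypothetical, and the bandwidth function rests on an assumption: it plugs the effective sample size into the classical rate n^{-1/(2β+1)} with the constant set to 1, which may differ from the constants and exact form in the paper.

```python
def effective_sample_size(n: int, tau: float) -> float:
    """Effective sample size n / (1 + n * tau) under perturbation strength tau."""
    if n <= 0:
        raise ValueError("n must be positive")
    if tau < 0:
        raise ValueError("tau must be non-negative")
    return n / (1.0 + n * tau)


def bandwidth_rate(n: int, tau: float, beta: float) -> float:
    """Bandwidth rate (1 / n_eff)^(1 / (2 * beta + 1)).

    Assumption (not from the source): the classical rate n^(-1/(2*beta+1))
    is applied to the effective sample size, with the constant set to 1.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    n_eff = effective_sample_size(n, tau)
    return (1.0 / n_eff) ** (1.0 / (2.0 * beta + 1.0))


if __name__ == "__main__":
    tau, beta = 0.01, 2.0
    for n in (100, 10_000, 1_000_000):
        n_eff = effective_sample_size(n, tau)
        h = bandwidth_rate(n, tau, beta)
        print(f"n={n:>9}  n_eff={n_eff:8.2f}  bandwidth rate={h:.4f}")
```

Because 1/n_eff = 1/n + τ, the effective sample size never exceeds 1/τ however large n grows, and once nτ is large the rate returned by `bandwidth_rate` approaches τ^{1/(2β+1)}, in line with the second finding.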
Abstract
We study nonparametric regression with covariates and an outcome under random unbiased perturbations (RUPs) of the conditional distribution of the outcome given the covariates, where the marginal distribution of covariates remains fixed but the conditional law varies randomly across datasets. Unlike adversarial distribution shift frameworks that yield conservative worst-case guarantees, RUPs induce dataset-level variance inflation rather than systematic bias. We provide examples of RUPs and show that this distributional uncertainty reduces the effective sample size to n/(1 + nτ), where τ quantifies the perturbation strength. For local polynomial estimators, we derive an extended bias-variance decomposition that includes a distributional variance term with the same bandwidth scaling as classical sampling variance. This leads to a modified bandwidth…
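As a rough sketch of how a distributional variance term with the same bandwidth scaling as the sampling variance changes the bandwidth trade-off, consider a univariate covariate and a pointwise mean squared error. The constants C_b and C_v and the exact form below are illustrative assumptions, not the paper's statement.

```latex
% Assumed sketch: univariate covariate, bandwidth h, smoothness beta,
% hypothetical constants C_b, C_v > 0.
\[
  \mathrm{MSE}(h) \;\approx\;
  \underbrace{C_b\, h^{2\beta}}_{\text{squared bias}}
  \;+\;
  \underbrace{\frac{C_v}{n h}}_{\text{sampling variance}}
  \;+\;
  \underbrace{\frac{C_v\, \tau}{h}}_{\text{distributional variance}}
  \;=\;
  C_b\, h^{2\beta} + \frac{C_v}{n_{\mathrm{eff}}\, h},
  \qquad
  n_{\mathrm{eff}} = \frac{n}{1 + n\tau}.
\]
% Setting the derivative in h to zero balances the two terms:
\[
  h^{*} \;\propto\; \left(\frac{1}{n} + \tau\right)^{1/(2\beta+1)}
  \;\approx\;
  \begin{cases}
    n^{-1/(2\beta+1)}, & n\tau \ll 1,\\[2pt]
    \tau^{1/(2\beta+1)}, & n\tau \gg 1.
  \end{cases}
\]
```

Under these assumptions, the second regime matches the bandwidth scaling listed in the findings for dominant distributional uncertainty, and the first recovers the classical rate.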
Taxonomy
Topics: Statistical Methods and Inference · Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data
