Efficiently estimating the error distribution in nonparametric regression with responses missing at random
Justin Chown, Ursula U. Müller

TL;DR
This paper shows that the empirical distribution function of complete case residuals efficiently estimates the error distribution in nonparametric regression with responses missing at random, and proves that this estimator satisfies a functional central limit theorem.
Contribution
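It shows that the residual-based empirical distribution function built from complete cases only is efficient in the sense of Hájek and Le Cam in nonparametric regression with responses missing at random. The proofs also derive the efficient influence function for estimating an arbitrary linear functional of the error distribution.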
Findings
The complete case residual-based empirical distribution function is efficient in the sense of Hájek and Le Cam.
The estimator admits a functional central limit theorem.
A small simulation study investigates the estimator's finite-sample performance.
Abstract
This article considers nonparametric regression models with multivariate covariates and with responses missing at random. We estimate the regression function with a local polynomial smoother. The residual-based empirical distribution function that only uses complete cases, i.e., residuals that can actually be constructed from the data, is shown to be efficient in the sense of Hájek and Le Cam. In the proofs we derive, more generally, the efficient influence function for estimating an arbitrary linear functional of the error distribution; this covers the distribution function as a special case. We also show that the complete case residual-based empirical distribution function admits a functional central limit theorem. The article concludes with a small simulation study investigating the performance of the complete case residual-based empirical distribution function.
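The procedure described in the abstract can be sketched in a few lines: fit a local polynomial smoother to the complete cases, form the residuals for those cases, and take their empirical distribution function. The following is a minimal illustration and not the authors' implementation. It assumes a single covariate (the paper treats multivariate covariates), a local linear fit (degree 1), an Epanechnikov kernel, a fixed bandwidth of 0.2, and a simulated model with standard normal errors and a logistic response probability; all of these choices, and the function names `local_linear_fit` and `complete_case_residual_ecdf`, are hypothetical.

```python
import math
import numpy as np


def local_linear_fit(x_train, y_train, x_eval, bandwidth):
    """Local linear (degree-1 local polynomial) estimate of the regression function."""
    fitted = np.empty(len(x_eval))
    for i, x0 in enumerate(x_eval):
        u = (x_train - x0) / bandwidth
        # Epanechnikov kernel weights (an illustrative choice).
        w = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)
        design = np.column_stack([np.ones_like(x_train), x_train - x0])
        weighted = design * w[:, None]
        # Solve the weighted least squares normal equations;
        # the intercept is the estimate of the regression function at x0.
        beta, *_ = np.linalg.lstsq(weighted.T @ design, weighted.T @ y_train, rcond=None)
        fitted[i] = beta[0]
    return fitted


def complete_case_residual_ecdf(x, y, delta, t_grid, bandwidth):
    """Empirical distribution function of residuals from complete cases only."""
    observed = delta == 1
    x_cc, y_cc = x[observed], y[observed]
    residuals = y_cc - local_linear_fit(x_cc, y_cc, x_cc, bandwidth)
    # Fraction of complete case residuals at or below each t.
    return np.mean(residuals[:, None] <= t_grid[None, :], axis=0)


# Simulated data (assumed model, for illustration only).
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(0.0, 1.0, n)
y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 1.0, n)

# Missing at random: the probability of observing Y depends on X only.
prob_observed = 1.0 / (1.0 + np.exp(-(0.5 + x)))
delta = (rng.uniform(size=n) < prob_observed).astype(int)
y_obs = np.where(delta == 1, y, np.nan)  # unobserved responses are never used

t_grid = np.linspace(-3.0, 3.0, 61)
F_hat = complete_case_residual_ecdf(x, y_obs, delta, t_grid, bandwidth=0.2)
F_true = np.array([0.5 * (1.0 + math.erf(t / math.sqrt(2.0))) for t in t_grid])

print("complete cases:", int(delta.sum()), "of", n)
print("sup distance to true error cdf on grid:", float(np.max(np.abs(F_hat - F_true))))
```

In practice the bandwidth and polynomial degree would be chosen to satisfy the smoothness conditions of the underlying theory; the fixed values here only keep the sketch short.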
