Kernel Density Estimation with Berkson Error
James P. Long, Noureddine El Karoui, John A. Rice

TL;DR
This paper develops kernel density estimators for the convolution of a true density with known Berkson error, compares bandwidth selection methods, and introduces a data-driven bandwidth estimator with applications in epidemiology.
Contribution
It introduces a new approach for bandwidth selection in Berkson error density estimation and analyzes its performance both asymptotically and through simulations.
Findings
Optimal bandwidth depends on the error structure.
Smoothing is crucial when the error density is concentrated near zero.
The proposed data-driven estimator performs well on NO₂ exposure data.
Abstract
Given a sample $\{X_i\}_{i=1}^n$ from $f_X$, we construct kernel density estimators for $f_Y$, the convolution of $f_X$ with a known error density $f_\epsilon$. This problem is known as density estimation with Berkson error and has applications in epidemiology and astronomy. Little is understood about bandwidth selection for Berkson density estimation. We compare three approaches to selecting the bandwidth, both asymptotically, using large-sample approximations to the MISE, and at finite samples, using simulations. Our results highlight the relationship between the structure of the error and the optimal bandwidth. In particular, the results demonstrate the importance of smoothing when the error term is concentrated near 0. We propose a data-driven bandwidth estimator and test its performance on NO₂ exposure data.
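As a rough illustration of the estimator the abstract describes, the sketch below convolves an ordinary kernel density estimate of the sampled density with a known error density. It assumes a Gaussian kernel and Gaussian error with standard deviation `sigma_eps`, so the convolution has a closed form: each observation contributes a normal density with variance `h**2 + sigma_eps**2`. These choices, the function names, and the simulated data are illustrative assumptions, not the paper's setup or its bandwidth selection method.

```python
import math
import random


def normal_pdf(x, sd):
    """Density of N(0, sd^2) evaluated at x."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def berkson_kde(y, sample, h, sigma_eps):
    """Estimate the density of Y = X + eps at y from a sample of X.

    Assumes (for illustration only) a Gaussian kernel with bandwidth h and
    Gaussian error eps ~ N(0, sigma_eps^2). The kernel convolved with the
    error density is then N(0, h^2 + sigma_eps^2).
    """
    sd = math.sqrt(h * h + sigma_eps * sigma_eps)
    return sum(normal_pdf(y - x, sd) for x in sample) / len(sample)


# Simulated data: X ~ N(0, 1), a hypothetical stand-in for real observations.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(500)]

# Evaluate the estimate on a grid and check that it integrates to about 1.
step = 0.05
grid = [-6.0 + step * k for k in range(241)]
density = [berkson_kde(y, sample, h=0.2, sigma_eps=0.5) for y in grid]
mass = sum(density) * step
```

In this Gaussian setting the estimate remains a smooth, valid density even at `h = 0`, because the error density alone already smooths the empirical distribution. That makes the choice of bandwidth less obvious than in ordinary kernel density estimation.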
Taxonomy
Topics: Statistical Methods and Inference · Statistical Methods and Bayesian Inference · Bayesian Methods and Mixture Models
