Approximating the Distribution of the Median and other Robust Estimators on Uncertain Data
Kevin Buchin, Jeff M. Phillips, and Pingfan Tang

TL;DR
This paper develops methods to approximate the distribution of robust estimators, such as the median, when each data point's location is uncertain and described by a discrete probability distribution.
Contribution
It introduces algorithms to construct and estimate the distribution of the median of an uncertain point set, along with a general approximation technique for distributions of robust estimators with respect to ranges with bounded VC dimension.
Findings
Near-linear time to build the approximate support of the median's distribution
Quadratic time to assign probability to that support
Applicable to high-dimensional median and regression estimators
Abstract
Robust estimators, like the median of a point set, are important for data analysis in the presence of outliers. We study robust estimators for locationally uncertain points with discrete distributions. That is, each point in a data set has a discrete probability distribution describing its location. The probabilistic nature of uncertain data makes it challenging to compute such estimators, since the true value of the estimator is now described by a distribution rather than a single point. We show how to construct and estimate the distribution of the median of a point set. Building the approximate support of the distribution takes near-linear time, and assigning probability to that support takes quadratic time. We also develop a general approximation technique for distributions of robust estimators with respect to ranges with bounded VC dimension. This includes the geometric median for…
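To make concrete what "the distribution of the median" means here, the following is a minimal brute-force sketch for one-dimensional data. It is not the paper's algorithm: it enumerates every combination of possible locations, which takes time exponential in the number of points, whereas the abstract describes near-linear and quadratic time approximations. The function name `median_distribution`, the `(location, probability)` input format, and the use of the lower median are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import product
from statistics import median_low


def median_distribution(points):
    """Exact distribution of the (lower) median of uncertain 1D points.

    `points` is a list of uncertain points; each uncertain point is a list
    of (location, probability) pairs whose probabilities sum to 1.
    Hypothetical helper for illustration; runs in exponential time.
    """
    dist = defaultdict(float)
    # Each combination picks one possible location per uncertain point.
    for combo in product(*points):
        locations = [loc for loc, _ in combo]
        prob = 1.0
        for _, p in combo:
            prob *= p  # points are assumed independent
        dist[median_low(locations)] += prob
    return dict(dist)


# Three uncertain points, each with two equally likely locations.
points = [
    [(1.0, 0.5), (4.0, 0.5)],
    [(2.0, 0.5), (6.0, 0.5)],
    [(3.0, 0.5), (5.0, 0.5)],
]
dist = median_distribution(points)
for location in sorted(dist):
    print(location, dist[location])
```

In this small example the median is no longer a single value: it falls on 2.0, 3.0, 4.0, or 5.0, each with probability 0.25. The keys of `dist` play the role of the support of the distribution, and the values are the probabilities assigned to that support.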
