Grid Point Approximation for Distributed Nonparametric Smoothing and Prediction

Yuan Gao; Rui Pan; Feng Li; Riquan Zhang; Hansheng Wang

arXiv:2409.14079·stat.CO·October 8, 2024·J. Comput. Graph. Stat.

TL;DR

This paper introduces a grid point approximation (GPA) method for distributed kernel smoothing. The GPA estimator is as statistically efficient as the global estimator under mild conditions, requires no communication at prediction time, and remains applicable when the data are not randomly distributed across machines.

Contribution

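The paper proposes a novel GPA method for distributed kernel smoothing that requires no communication for prediction, matches the statistical efficiency of the global estimator under mild conditions, and handles data that are not randomly distributed across machines. It also develops two bandwidth selectors with theoretical support.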

Findings

01

The GPA estimator is as statistically efficient as the global estimator under mild conditions.

02

GPA requires no communication and is computationally very efficient for prediction.

03

Two new bandwidth selectors are developed and theoretically supported.

Abstract

Kernel smoothing is a widely used nonparametric method in modern statistical analysis. Efficiently conducting kernel smoothing for a massive dataset on a distributed system is a problem of great importance. In this work, we find that the popularly used one-shot type estimator is highly inefficient for prediction purposes. To address this, we propose a novel grid point approximation (GPA) method, which has the following advantages. First, the resulting GPA estimator is as statistically efficient as the global estimator under mild conditions. Second, it requires no communication and is extremely efficient in terms of computation for prediction. Third, it is applicable to the case where the data are not randomly distributed across different machines. To select a suitable bandwidth, two novel bandwidth selectors are further developed and theoretically supported. Extensive…
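
The abstract does not spell out how the GPA estimator is constructed. As a minimal sketch, assuming a one-dimensional Nadaraya-Watson smoother with a Gaussian kernel, a fixed equally spaced grid, and linear interpolation between grid points, the idea of fitting once on grid points and then predicting without further communication might look like the following. The helper names `local_sums` and `gpa_predict`, the bandwidth, the grid size, and the simulated data are all illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def local_sums(grid, x, y, h):
    """Kernel-weighted sums one machine would compute on its own shard.

    Returns the Nadaraya-Watson numerator and denominator at each grid
    point (Gaussian kernel, bandwidth h). Assumed setup, not the paper's.
    """
    u = (grid[:, None] - x[None, :]) / h
    w = np.exp(-0.5 * u**2)
    return w @ y, w.sum(axis=1)

def gpa_predict(x_new, grid, m_grid):
    """Predict by linear interpolation of the stored grid-point values."""
    return np.interp(x_new, grid, m_grid)

# Simulated data (assumption): noisy sine curve on [0, 1].
rng = np.random.default_rng(0)
n, h = 2000, 0.05
x = rng.uniform(0.0, 1.0, n)
y = np.sin(2 * np.pi * x) + 0.1 * rng.normal(size=n)

# Deliberately non-random split: each shard holds a contiguous range of x.
order = np.argsort(x)
shards = np.array_split(order, 4)

# Each shard contributes its sums once; the totals are then combined.
grid = np.linspace(0.0, 1.0, 51)
num = np.zeros_like(grid)
den = np.zeros_like(grid)
for idx in shards:
    s_num, s_den = local_sums(grid, x[idx], y[idx], h)
    num += s_num
    den += s_den
m_grid = num / den

# Prediction uses only the stored grid values, not the raw data.
x_new = np.array([0.25, 0.5, 0.77])
pred = gpa_predict(x_new, grid, m_grid)
```

In this sketch the numerators and denominators are summed across shards, so the grid-point values coincide with those of the estimator computed on the pooled data regardless of how the observations are partitioned. Prediction afterwards touches only the `grid` and `m_grid` arrays. Whether the paper's estimator uses this exact aggregation or interpolation scheme is not stated in the abstract.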
