How Many Machines Can We Use in Parallel Computing for Kernel Ridge Regression?
Meimei Liu, Zuofeng Shang, Guang Cheng

TL;DR
This paper investigates how many machines can be used for parallel kernel ridge regression while still achieving optimal estimation and testing, providing theoretical bounds and a unified empirical process framework applicable to various nonparametric regression problems.
Contribution
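It establishes a range for the number of machines under which optimal estimation and testing are achievable in kernel ridge regression, and shows that the upper bounds on this number cannot be improved (up to a logarithmic factor) in smoothing spline regression and Gaussian RKHS regression.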
Findings
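Identifies a range for the number of machines under which optimal estimation and testing remain achievable
Provides a unified empirical process framework covering various regression problems (such as thin-plate splines and nonparametric additive regression)
Proves the upper bounds on the number of machines are unimprovable, up to a logarithmic factor, in smoothing spline and Gaussian RKHS regression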
Abstract
This paper aims to solve a basic problem in distributed statistical inference: how many machines can we use in parallel computing? In kernel ridge regression, we address this question in two important settings: nonparametric estimation and hypothesis testing. Specifically, we find a range for the number of machines under which optimal estimation/testing is achievable. The employed empirical processes method provides a unified framework that allows us to handle various regression problems (such as thin-plate splines and nonparametric additive regression) under different settings (such as univariate, multivariate and diverging-dimensional designs). It is worth noting that the upper bounds on the number of machines are proven to be unimprovable (up to a logarithmic factor) in two important cases: smoothing spline regression and Gaussian RKHS regression. Our theoretical findings are backed…
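To make the setting concrete, here is a minimal sketch of divide-and-conquer kernel ridge regression. It assumes the common scheme in which the sample is split evenly across the machines, each machine fits a local KRR estimator, and the local predictions are averaged. The abstract does not spell out the estimator, so the averaging scheme, the Gaussian kernel, the bandwidth, the penalty `lam`, the test function, and every function name below are assumptions chosen for illustration, not the authors' implementation.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=0.2):
    # Pairwise Gaussian (RBF) kernel between 1-D input arrays a and b.
    diff = a[:, None] - b[None, :]
    return np.exp(-diff ** 2 / (2.0 * bandwidth ** 2))


def fit_krr(x, y, lam, bandwidth=0.2):
    # Local KRR fit: solve (K + n * lam * I) alpha = y on one machine's data.
    n = len(x)
    K = gaussian_kernel(x, x, bandwidth)
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y)
    return x, alpha


def predict_krr(model, x_new, bandwidth=0.2):
    # Evaluate the fitted function sum_i alpha_i * k(x_new, x_i).
    x_train, alpha = model
    return gaussian_kernel(x_new, x_train, bandwidth) @ alpha


def divide_and_conquer_krr(x, y, num_machines, lam, x_new, bandwidth=0.2, seed=0):
    # Assumed scheme: random even split, one local fit per machine,
    # then a plain average of the local predictions.
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(len(x)), num_machines)
    preds = [
        predict_krr(fit_krr(x[idx], y[idx], lam, bandwidth), x_new, bandwidth)
        for idx in parts
    ]
    return np.mean(preds, axis=0)


def demo_errors(n=512, machine_counts=(1, 4, 16, 64), lam=1e-3, seed=0):
    # Hypothetical simulation: mean squared error on a grid for several
    # machine counts, using a sine regression function and Gaussian noise.
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, n)
    y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(n)
    grid = np.linspace(0.0, 1.0, 200)
    truth = np.sin(2 * np.pi * grid)
    return {
        s: float(np.mean((divide_and_conquer_krr(x, y, s, lam, grid) - truth) ** 2))
        for s in machine_counts
    }
```

Calling `demo_errors()` returns a dictionary mapping each machine count to the error of the averaged estimator, which gives a rough way to watch how accuracy changes as the data is spread across more machines. The penalty here is fixed rather than tuned to the rates studied in the paper, so the numbers are only indicative.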
Taxonomy
Topics: Statistical Methods and Inference · Gaussian Processes and Bayesian Inference · Sparse and Compressive Sensing Techniques
