Uncertainty quantification for distributed regression

Valeriy Avanesov

arXiv:2105.11425·stat.ML·May 25, 2021

Open Access

TL;DR

This paper introduces a data-driven method to quantify uncertainty in distributed regression estimators, providing confidence bands and theoretical guarantees for large-scale Kernel Ridge Regression.

Contribution

It develops a novel approach for uncertainty quantification in divide-and-conquer regression, with rigorous theoretical guarantees for a broad class of learners.
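The divide-and-conquer scheme referred to here splits the data into disjoint partitions, fits a local estimator on each, and averages the local predictions. A minimal NumPy sketch of that scheme with Kernel Ridge Regression as the base learner might look as follows. The Gaussian kernel, the bandwidth, the ridge parameter, the synthetic data, and all function names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def rbf_kernel(a, b, bandwidth=0.2):
    """Gaussian kernel matrix between 1-D point sets a and b (assumed kernel)."""
    diff = a[:, None] - b[None, :]
    return np.exp(-diff**2 / (2.0 * bandwidth**2))

def krr_fit_predict(x_train, y_train, x_test, ridge=1e-3):
    """Kernel Ridge Regression: solve (K + n*ridge*I) alpha = y, then predict."""
    n = len(x_train)
    gram = rbf_kernel(x_train, x_train)
    alpha = np.linalg.solve(gram + n * ridge * np.eye(n), y_train)
    return rbf_kernel(x_test, x_train) @ alpha

def divide_and_conquer_krr(x, y, x_test, n_partitions, rng):
    """Split into disjoint partitions, fit locally, average the predictions."""
    permuted = rng.permutation(len(x))
    parts = np.array_split(permuted, n_partitions)
    local = np.stack([krr_fit_predict(x[idx], y[idx], x_test) for idx in parts])
    return local.mean(axis=0), local

# Synthetic example (hypothetical data): noisy sine curve on [0, 1].
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=2000)
y = np.sin(2.0 * np.pi * x) + 0.3 * rng.standard_normal(2000)
x_test = np.linspace(0.05, 0.95, 19)  # deterministic prediction set

averaged, local_predictions = divide_and_conquer_krr(
    x, y, x_test, n_partitions=10, rng=rng
)
max_error = np.max(np.abs(averaged - np.sin(2.0 * np.pi * x_test)))
print(f"sup-norm error on the prediction set: {max_error:.3f}")
```

Each local fit solves a linear system of size n/m instead of n, which is where the computational saving comes from: the cost of the solve grows cubically in the partition size.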

Findings

01

Constructs simultaneous confidence bands for distributed estimators

02

Provides sup-norm consistency results for divide-and-conquer Kernel Ridge Regression

03

Simulation studies support theoretical guarantees

Abstract

The ever-growing size of datasets renders well-studied learning techniques, such as Kernel Ridge Regression, inapplicable, posing a serious computational challenge. Divide-and-conquer is a common remedy: split the dataset into disjoint partitions, obtain the local estimates, and average them. This allows an otherwise ineffective base approach to scale up. In the current study we suggest a fully data-driven approach to quantify the uncertainty of the averaged estimator. Namely, we construct simultaneous element-wise confidence bands for the predictions yielded by the averaged estimator on a given deterministic prediction set. The novel approach features rigorous theoretical guarantees for a wide class of base learners, with Kernel Ridge Regression being a special case. As a by-product of our analysis we also obtain a sup-norm consistency result for the divide-and-conquer Kernel…
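To make the idea of a simultaneous band on a fixed prediction set concrete, here is a generic sketch that resamples the local estimates and calibrates a single half-width from the sup-norm deviation. This is a standard bootstrap stand-in chosen for illustration, not the procedure developed in the paper: the constant-width band, the function name `sup_norm_band`, and the synthetic local predictions are all assumptions.

```python
import numpy as np

def sup_norm_band(local_predictions, level=0.95, n_boot=2000, seed=0):
    """Generic bootstrap band (NOT the paper's method).

    local_predictions has shape (m, k): m local estimators evaluated at
    k prediction points. Partitions are resampled with replacement and the
    `level` quantile of the sup-norm deviation is used as the half-width.
    """
    rng = np.random.default_rng(seed)
    m = local_predictions.shape[0]
    center = local_predictions.mean(axis=0)
    idx = rng.integers(0, m, size=(n_boot, m))         # resampled partition ids
    boot_means = local_predictions[idx].mean(axis=1)   # shape (n_boot, k)
    sup_dev = np.abs(boot_means - center).max(axis=1)  # sup over prediction set
    half_width = np.quantile(sup_dev, level)
    return center - half_width, center + half_width

# Hypothetical local predictions: a true curve plus independent noise.
rng = np.random.default_rng(1)
x_test = np.linspace(0.05, 0.95, 19)
truth = np.sin(2.0 * np.pi * x_test)
local_predictions = truth + 0.1 * rng.standard_normal((20, 19))

lower, upper = sup_norm_band(local_predictions)
print(f"band half-width: {(upper - lower)[0] / 2:.3f}")
```

Taking the maximum over the prediction set before computing the quantile is what makes the band simultaneous: it targets coverage at all points at once rather than at each point separately.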

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Distributed Sensor Networks and Detection Algorithms · Statistical Methods and Inference