ParK: Sound and Efficient Kernel Ridge Regression by Feature Space Partitions

Luigi Carratino; Stefano Vigogna; Daniele Calandriello; Lorenzo Rosasco

arXiv:2106.12231 · stat.ML · October 18, 2022

TL;DR

ParK is a scalable kernel ridge regression solver that combines feature space partitions, random projections, and iterative optimization to reduce computational cost on large datasets while maintaining statistical accuracy.

Contribution

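The paper builds partitions directly in the feature space rather than in the input space and combines them with random projections. This promotes orthogonality between the local estimators, which keeps the local effective dimension and bias under control and lowers cost without giving up statistical accuracy.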

Findings

01. Reduces the space and time complexity of kernel ridge regression.

02. Provably maintains the same statistical accuracy.

03. Proves effective in numerical experiments on large-scale datasets.

Abstract

We introduce ParK, a new large-scale solver for kernel ridge regression. Our approach combines partitioning with random projections and iterative optimization to reduce space and time complexity while provably maintaining the same statistical accuracy. In particular, constructing suitable partitions directly in the feature space rather than in the input space, we promote orthogonality between the local estimators, thus ensuring that key quantities such as local effective dimension and bias remain under control. We characterize the statistical-computational tradeoff of our model, and demonstrate the effectiveness of our method by numerical experiments on large-scale datasets.
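The combination described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the centroid choice (random training points), the Gaussian kernel, the Nyström landmark subsampling standing in for random projections, and the direct least-squares solve standing in for iterative optimization are all assumptions made for illustration, and every function name below is hypothetical. Points are routed to the nearest centroid under the feature-space distance, and a small local estimator is fit in each cell.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def assign_cells(X, centroids, kernel):
    """Index of the nearest centroid in feature space for each row of X.

    Uses ||phi(x) - phi(c)||^2 = k(x, x) + k(c, c) - 2 k(x, c). The k(x, x)
    term is constant per row, so it is dropped from the argmin.
    """
    k_cc = np.diag(kernel(centroids, centroids))
    return (k_cc[None, :] - 2.0 * kernel(X, centroids)).argmin(axis=1)

def fit_local(X, y, kernel, lam, m, rng):
    """Nystrom KRR on one cell with at most m random landmarks (assumed scheme)."""
    n = X.shape[0]
    landmarks = X[rng.choice(n, size=min(m, n), replace=False)]
    K_nm = kernel(X, landmarks)
    K_mm = kernel(landmarks, landmarks)
    # Normal equations of (1/n)||K_nm a - y||^2 + lam * a^T K_mm a.
    A = K_nm.T @ K_nm + lam * n * K_mm
    # Direct least-squares solve; an iterative solver would replace this step.
    alpha = np.linalg.lstsq(A, K_nm.T @ y, rcond=None)[0]
    return landmarks, alpha

def fit(X, y, centroids, kernel, lam=1e-3, m=30, seed=0):
    """Fit one local estimator per non-empty cell of the partition."""
    rng = np.random.default_rng(seed)
    cells = assign_cells(X, centroids, kernel)
    return {int(q): fit_local(X[cells == q], y[cells == q], kernel, lam, m, rng)
            for q in np.unique(cells)}

def predict(X, centroids, kernel, models):
    """Route each point to its cell and evaluate that cell's estimator."""
    cells = assign_cells(X, centroids, kernel)
    out = np.zeros(X.shape[0])  # cells without a fitted model predict 0
    for q, (landmarks, alpha) in models.items():
        mask = cells == q
        if mask.any():
            out[mask] = kernel(X[mask], landmarks) @ alpha
    return out

# Toy usage on synthetic 1-D data (all values here are illustrative).
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(600, 1))
y = np.sin(2.0 * X[:, 0]) + 0.05 * rng.standard_normal(600)
kernel = lambda A, B: gaussian_kernel(A, B, sigma=0.5)
centroids = X[rng.choice(600, size=4, replace=False)]
models = fit(X, y, centroids, kernel)
y_hat = predict(X, centroids, kernel, models)
mse = float(np.mean((y_hat - y) ** 2))
```

Each local problem only involves the points and landmarks of a single cell, which is where the savings in space and time would come from. Note that for the Gaussian kernel the nearest centroid in feature space coincides with the nearest centroid in input space, so this toy setup does not by itself show the benefit of feature-space partitions claimed in the abstract; other kernels would generally give different cells.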

Videos

ParK: Sound and Efficient Kernel Ridge Regression by Feature Space Partitions · SlidesLive

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks