Accurate, Fast and Scalable Kernel Ridge Regression on Parallel and Distributed Systems
Yang You, James Demmel, Cho-Jui Hsieh, Richard Vuduc

TL;DR
This paper introduces two scalable methods, BKRR and KKRR, for kernel ridge regression that significantly improve weak scaling efficiency and speed up training on parallel and distributed systems.
Contribution
The paper presents two novel partitioning-based methods, BKRR and KKRR, that enhance weak scaling and speed for kernel ridge regression on large-scale distributed systems.
Findings
KKRR2 improves weak scaling efficiency from 0.32% to 38%.
KKRR2 achieves a 591x speedup for the same accuracy.
BKRR2 achieves up to 92% weak scaling efficiency and 3505x speedup for approximate solutions.
Abstract
We propose two new methods to address the weak scaling problems of KRR: the Balanced KRR (BKRR) and K-means KRR (KKRR). These methods consider alternative ways to partition the input dataset into p different parts, generating p different models, and then selecting the best model among them. Compared to a conventional implementation, KKRR2 (optimized version of KKRR) improves the weak scaling efficiency from 0.32% to 38% and achieves a 591 times speedup for getting the same accuracy by using the same data and the same hardware (1536 processors). BKRR2 (optimized version of BKRR) achieves a higher accuracy than the current fastest method using less training time for a variety of datasets. For the applications requiring only approximate solutions, BKRR2 improves the weak scaling efficiency to 92% and achieves 3505 times speedup (theoretical speedup: 4096 times).
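The partition, train, and select pattern described in the abstract can be illustrated with a minimal single-process sketch. This is not the authors' implementation. It assumes a Gaussian kernel, a random equal-size split in place of the K-means or balanced partitioning that KKRR and BKRR use, and selection by lowest mean squared error on a held-out validation set. The names `gaussian_kernel`, `fit_krr`, `predict_krr`, and `partitioned_krr`, and the parameters `lam` and `sigma`, are hypothetical.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    # Pairwise squared distances, clipped at 0 to absorb rounding error.
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def fit_krr(X, y, lam=1e-3, sigma=1.0):
    # Solve (K + lam * n * I) alpha = y for the dual coefficients.
    n = len(X)
    K = gaussian_kernel(X, X, sigma)
    alpha = np.linalg.solve(K + lam * n * np.eye(n), y)
    return X, alpha

def predict_krr(model, X_new, sigma=1.0):
    X_train, alpha = model
    return gaussian_kernel(X_new, X_train, sigma) @ alpha

def partitioned_krr(X, y, X_val, y_val, p, lam=1e-3, sigma=1.0, seed=0):
    # Assumption: random equal-size split, not the paper's partitioning.
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(len(X)), p)
    # One independent model per part; in a distributed setting each
    # would be trained on its own processor.
    models = [fit_krr(X[idx], y[idx], lam, sigma) for idx in parts]
    # Assumption: pick the model with the lowest validation MSE.
    errors = [float(np.mean((predict_krr(m, X_val, sigma) - y_val) ** 2))
              for m in models]
    best = int(np.argmin(errors))
    return models[best], errors
```

Each of the p sub-problems solves a dense linear system of size n/p instead of n, so the cubic cost of the direct solve drops from order n^3 to order (n/p)^3 per model, and the p solves need no communication with one another.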
