Accurate, Fast and Scalable Kernel Ridge Regression on Parallel and Distributed Systems

Yang You; James Demmel; Cho-Jui Hsieh; Richard Vuduc

arXiv:1805.00569·cs.DC·May 3, 2018

TL;DR

This paper introduces two scalable methods, BKRR and KKRR, for kernel ridge regression that significantly improve weak scaling efficiency and speed up training on parallel and distributed systems.

Contribution

The paper presents two novel partitioning-based methods, BKRR and KKRR, that enhance weak scaling and speed for kernel ridge regression on large-scale distributed systems.

Findings

01

KKRR2 improves weak scaling efficiency from 0.32% to 38%.

02

KKRR2 achieves a 591x speedup for the same accuracy.

03

BKRR2 achieves up to 92% weak scaling efficiency and 3505x speedup for approximate solutions.
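The two metrics quoted in these findings can be made concrete with a short sketch. It assumes the usual definition of weak scaling efficiency (runtime on one processor divided by runtime on p processors, with the work per processor held fixed); the function names and the timings in the usage lines are hypothetical and not taken from the paper. Only the 3505 and 4096 figures come from the text above.

```python
def weak_scaling_efficiency(time_one_proc, time_p_procs):
    """Weak scaling efficiency under the usual definition (an assumption here):
    work per processor is fixed, so ideal runtime stays constant as p grows."""
    return time_one_proc / time_p_procs


def fraction_of_ideal_speedup(observed_speedup, theoretical_speedup):
    """How close an observed speedup comes to its theoretical ceiling."""
    return observed_speedup / theoretical_speedup


# Hypothetical timings: 2.0 s on one processor, 4.0 s on p processors.
print(weak_scaling_efficiency(2.0, 4.0))  # 0.5

# Figures reported above: 3505 times observed against 4096 times theoretical.
print(round(fraction_of_ideal_speedup(3505, 4096), 3))  # 0.856
```

Note that the 92% weak scaling efficiency and the roughly 86% of theoretical speedup are different measures, so they need not coincide.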

Abstract

We propose two new methods to address the weak scaling problems of Kernel Ridge Regression (KRR): Balanced KRR (BKRR) and K-means KRR (KKRR). These methods consider alternative ways to partition the input dataset into p different parts, generate p different models, and then select the best model among them. Compared to a conventional implementation, KKRR2 (an optimized version of KKRR) improves the weak scaling efficiency from 0.32% to 38% and achieves a 591 times speedup at the same accuracy, using the same data and the same hardware (1536 processors). BKRR2 (an optimized version of BKRR) achieves higher accuracy than the current fastest method in less training time on a variety of datasets. For applications requiring only approximate solutions, BKRR2 improves the weak scaling efficiency to 92% and achieves a 3505 times speedup (theoretical speedup: 4096 times).
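The partition, train, and select idea in the abstract can be illustrated with a minimal serial sketch. This is not the authors' implementation. It assumes a Gaussian kernel, a plain Lloyd's k-means for the partitioning, a regularized solve of the form (K + lam I) alpha = y, and model selection by mean squared error on a held-out set. All function names and parameters are hypothetical. In a distributed setting each part would be handled by a separate processor; here the p models are trained in a loop.

```python
import numpy as np


def gaussian_kernel(A, B, sigma):
    # Pairwise squared Euclidean distances, then the Gaussian (RBF) kernel.
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))


def kmeans_labels(X, p, iters=20, seed=0):
    # Plain Lloyd's k-means (illustrative; no balancing of part sizes).
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=p, replace=False)]
    for _ in range(iters):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(1)
        for j in range(p):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(0)
    return labels


def fit_krr(X, y, lam, sigma):
    # Solve (K + lam * I) alpha = y for the dual coefficients.
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)


def partition_and_select(X, y, X_val, y_val, p, lam=1e-3, sigma=1.0):
    # Train one KRR model per part, keep the one with the lowest validation MSE.
    labels = kmeans_labels(X, p)
    all_mse = {}
    best = None
    for j in range(p):
        mask = labels == j
        if not mask.any():
            continue  # skip empty parts
        Xj, yj = X[mask], y[mask]
        alpha = fit_krr(Xj, yj, lam, sigma)
        pred = gaussian_kernel(X_val, Xj, sigma) @ alpha
        mse = float(np.mean((pred - y_val) ** 2))
        all_mse[j] = mse
        if best is None or mse < best["mse"]:
            best = {"mse": mse, "part": j, "X": Xj, "alpha": alpha}
    best["all_mse"] = all_mse
    return best


# Synthetic usage example (data is made up for illustration).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = np.sin(X[:, 0])
X_val = rng.normal(size=(50, 2))
y_val = np.sin(X_val[:, 0])
result = partition_and_select(X, y, X_val, y_val, p=4)
print(result["part"], result["mse"])
```

Each per-part solve works on a kernel matrix of roughly (n/p) by (n/p) entries rather than n by n, which is the reason partitioning reduces both memory and time per processor.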
