Estimating leverage scores via rank revealing methods and randomization

Aleksandros Sobczyk (1), Efstratios Gallopoulos (2) ((1) IBM Research Europe, Zurich, Switzerland; (2) Computer Engineering and Informatics Department, University of Patras, Greece)

arXiv:2105.11004·cs.DS·March 8, 2022

TL;DR

This paper introduces new algorithms that efficiently estimate leverage scores of matrices using rank revealing and randomized methods, applicable to both full-rank and rank-deficient data, with strong theoretical and empirical validation.

Contribution

The paper presents novel fast algorithms for rank estimation, column subset selection, and leverage score estimation that work effectively on arbitrary rank matrices, including rank-deficient cases.

Findings

1. Algorithms achieve accurate leverage score estimates with improved complexity.

2. Extensive experiments demonstrate superior performance on synthetic and real data.

3. The methods provide meaningful approximation bounds and outperform existing techniques.

Abstract

We study algorithms for estimating the statistical leverage scores of rectangular dense or sparse matrices of arbitrary rank. Our approach is based on combining rank revealing methods with compositions of dense and sparse randomized dimensionality reduction transforms. We first develop a set of fast novel algorithms for rank estimation, column subset selection and least squares preconditioning. We then describe the design and implementation of leverage score estimators based on these primitives. These estimators are also effective for rank deficient input, which is frequently the case in data analytics applications. We provide detailed complexity analyses for all algorithms as well as meaningful approximation bounds and comparisons with the state-of-the-art. We conduct extensive numerical experiments to evaluate our algorithms and to illustrate their properties and performance using…
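For orientation, the leverage score of row i of a matrix A is the squared Euclidean norm of row i of any orthonormal basis for the column space of A, so the scores sum to rank(A). The sketch below is a minimal NumPy illustration of that definition and of the general idea of replacing an expensive factorization of A with a factorization of a much smaller randomized sketch. It is not the paper's algorithm and does not use the pylspack API: it assumes a dense Gaussian sketch and an SVD of the sketch in place of the rank revealing methods and sparse transforms the abstract mentions, and the function names, tolerance, and sketch size are hypothetical choices made for this example.

```python
import numpy as np


def exact_leverage_scores(A, tol=1e-10):
    """Row leverage scores of A from a thin SVD; also returns the numerical rank.

    Works for rank-deficient A because only the left singular vectors
    belonging to singular values above the threshold are used.
    """
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    # Numerical rank: singular values above a relative threshold (assumed tol).
    rank = int(np.sum(s > tol * s[0])) if s.size and s[0] > 0 else 0
    U_r = U[:, :rank]
    return np.sum(U_r**2, axis=1), rank


def sketched_leverage_scores(A, rank, sketch_rows, rng):
    """Approximate row leverage scores from a dense Gaussian sketch of A.

    Illustrative only: S is an (sketch_rows x n) Gaussian matrix, and the
    SVD of the small matrix S @ A supplies a change of basis under which
    the columns of A are approximately orthonormal.
    """
    n, _ = A.shape
    S = rng.standard_normal((sketch_rows, n)) / np.sqrt(sketch_rows)
    _, s, Vt = np.linalg.svd(S @ A, full_matrices=False)
    # B = A V_r diag(1 / s_r); its squared row norms estimate the scores.
    B = A @ (Vt[:rank].T / s[:rank])
    return np.sum(B**2, axis=1)


# Small rank-deficient example: a 1000 x 6 matrix of rank 3.
rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 3)) @ rng.standard_normal((3, 6))

exact, rank = exact_leverage_scores(A)
approx = sketched_leverage_scores(A, rank, sketch_rows=400, rng=rng)
```

In this sketch the rank is taken from the exact SVD purely to keep the example short; in a setting where the SVD of A is too expensive, the rank would itself have to be estimated from the sketch. The quality of the approximation depends on the sketch size relative to the rank.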

Code & Models

Repositories

IBM/pylspack (official)
