H2OPUS-TLR: High Performance Tile Low Rank Symmetric Factorizations using Adaptive Randomized Approximation
Wajih Boukaram, Stefano Zampini, George Turkiyyah, David Keyes

TL;DR
This paper introduces a high-performance algorithm for tile low rank matrix factorizations, specifically Cholesky and LDL^T, optimized for GPUs and CPUs using adaptive randomized approximation and batching techniques.
Contribution
It develops a novel dynamic batching method combined with adaptive randomized approximations to efficiently factor TLR matrices on modern hardware.
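To illustrate what "adaptive randomized approximation" means for a single tile, here is a minimal sketch of a generic adaptive randomized range finder. It is not the paper's batched GPU implementation: it uses plain Python lists and processes one sample vector at a time. The names `adaptive_range_basis`, `project`, `tol`, and `patience` are hypothetical and chosen only for this example. The idea is to keep drawing random samples of the tile's range until several consecutive samples add nothing above the tolerance, so the rank is discovered rather than fixed in advance.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [dot(row, x) for row in A]


def adaptive_range_basis(A, tol, patience=3, seed=0):
    """Grow an orthonormal basis for the range of A one random sample at a time.

    Stops once `patience` consecutive samples have a residual norm below `tol`
    after the current basis is projected out (a probabilistic error estimate),
    or once the basis reaches min(m, n) vectors.
    """
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    Q = []      # orthonormal basis vectors, each of length m
    small = 0   # consecutive samples whose residual fell below tol
    while small < patience and len(Q) < min(m, n):
        omega = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = matvec(A, omega)
        for _ in range(2):  # two Gram-Schmidt passes for numerical stability
            for q in Q:
                c = dot(q, y)
                y = [yi - c * qi for yi, qi in zip(y, q)]
        nrm = math.sqrt(dot(y, y))
        if nrm < tol:
            small += 1
        else:
            small = 0
            Q.append([yi / nrm for yi in y])
    return Q


def project(A, Q):
    """Return B with rows B[r] = Q[r]^T A, so that A ~= sum_r Q[r] B[r]."""
    cols = list(zip(*A))
    return [[dot(q, col) for col in cols] for q in Q]


# Hypothetical test tile: an exactly rank-3 matrix built from random factors.
rng = random.Random(1)
m, n, k = 12, 10, 3
U = [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(m)]
V = [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(n)]
A = [[dot(U[i], V[j]) for j in range(n)] for i in range(m)]

Q = adaptive_range_basis(A, tol=1e-8)
B = project(A, Q)

# Largest entrywise error of the low rank reconstruction.
err = max(
    abs(A[i][j] - sum(Q[r][i] * B[r][j] for r in range(len(Q))))
    for i in range(m)
    for j in range(n)
)
print("detected rank:", len(Q), "max abs error:", err)
```

A high-performance version would sample in blocks so that the work becomes matrix-matrix products that can be batched across many tiles. The vector-at-a-time loop above only shows the stopping logic.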
Findings
Achieves over 1.2 TFLOP/s in double precision on a V100 GPU.
Successfully factors large covariance matrices in seconds with acceptable accuracy.
Demonstrates potential for porting to tensor cores and other advanced hardware.
Abstract
Tile low rank representations of dense matrices partition them into blocks of roughly uniform size, where each off-diagonal tile is compressed and stored as its own low rank factorization. They offer an attractive representation for many data-sparse dense operators that appear in practical applications, where substantial compression and a much smaller memory footprint can be achieved. TLR matrices are a compromise between the simplicity of a regular perfectly-strided data structure and the optimal complexity of the unbalanced trees of hierarchically low rank matrices, and provide a convenient performance-tuning parameter through their tile size that can be proportioned to take into account the cache size where the tiles reside in the memory hierarchy. There are currently no high-performance algorithms that can generate Cholesky and LDL^T factorizations, particularly on GPUs. The…
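The storage layout described in the abstract can be written out explicitly. The symbols $N$, $b$, $n_b$, $k_{ij}$ and the simplifications of an exactly uniform tile size and a uniform rank $k$ are assumptions made here for illustration; they are not taken from the paper.

```latex
\[
A =
\begin{pmatrix}
A_{11} & \cdots & A_{1 n_b} \\
\vdots & \ddots & \vdots \\
A_{n_b 1} & \cdots & A_{n_b n_b}
\end{pmatrix},
\qquad A_{ij} \in \mathbb{R}^{b \times b}, \qquad N = n_b\, b .
\]
% Diagonal tiles stay dense; each off-diagonal tile is stored in factored form.
\[
A_{ij} \approx U_{ij} V_{ij}^{T} \quad (i \neq j),
\qquad U_{ij},\, V_{ij} \in \mathbb{R}^{b \times k_{ij}},
\qquad k_{ij} \ll b .
\]
% Entry counts for a symmetric matrix (lower triangle only), uniform rank k:
\[
\underbrace{n_b\, b^{2}}_{\text{dense diagonal tiles}}
\;+\;
\underbrace{\tfrac{n_b (n_b - 1)}{2} \cdot 2\, b\, k}_{\text{off-diagonal } U_{ij},\, V_{ij}}
\;=\; n_b\, b^{2} + n_b (n_b - 1)\, b\, k
\qquad \text{versus} \qquad
N^{2} = n_b^{2}\, b^{2} \ \text{for dense storage.}
\]
```

Under these assumptions the off-diagonal storage shrinks by a factor of roughly $b / (2k)$ relative to dense tiles, which is why the tile size acts as a tuning parameter: larger tiles compress better only as long as their ranks stay small.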
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Electromagnetic Scattering and Analysis
