ML-Based Optimum Sub-system Size Heuristic for the GPU Implementation of the Tridiagonal Partition Method

Milena Veneva

arXiv:2510.27351·cs.DC·May 22, 2026

TL;DR

This paper develops a machine learning heuristic using k-nearest neighbors to optimize sub-system sizes in GPU implementations of the parallel partition algorithm, improving efficiency for solving linear algebraic systems.

Contribution

It introduces an ML-based heuristic for selecting the optimum sub-system size in the GPU implementation of the parallel partition algorithm, including its recursive variant, as an alternative to finding that size empirically for each system.

Findings

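01 kNN classification predicts the optimum sub-system size for a given SLAE size.

02 Comparing predicted values with measured data shows the heuristic to be acceptably accurate.

03 The approach extends to the recursive parallel partition algorithm, including a kNN model for the optimum number of recursive steps.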

Abstract

This paper presents a machine learning (ML)-based heuristic for finding the optimum sub-system size for the CUDA implementation of the parallel partition algorithm. Computational experiments are conducted for systems of linear algebraic equations (SLAEs) of different sizes, and the optimum sub-system size for each is found empirically. To build a model for the sub-system size, we apply the k-nearest neighbors (kNN) classification method and analyze the results statistically. Comparing the predicted values with the actual data, the model is deemed acceptably accurate. Next, the heuristic is extended to the recursive parallel partition algorithm: an algorithm for determining the optimum sub-system size at each recursive step is formulated, and a kNN model for predicting the optimum number of recursive steps for a given SLAE size is built.
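
The kNN step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the paper's method: the single feature (log10 of the SLAE size), the value of k, the tie-breaking rule, and above all the training pairs, which are hypothetical placeholders and not measurements from the paper. The function name `knn_predict` is likewise illustrative.

```python
import math
from collections import Counter

# Hypothetical placeholder data: (SLAE size, empirically best sub-system size).
# These pairs are NOT from the paper; they only give the classifier something to fit.
TRAIN = [
    (1_000, 4), (2_000, 4), (5_000, 4),
    (10_000, 8), (20_000, 8), (50_000, 8),
    (100_000, 16), (200_000, 16), (500_000, 16),
    (1_000_000, 32), (2_000_000, 32), (5_000_000, 32),
]

def knn_predict(train, slae_size, k=3):
    """Return the majority label among the k training sizes nearest to slae_size.

    Distance is measured on log10(size), an assumption made here because
    SLAE sizes span several orders of magnitude.
    """
    target = math.log10(slae_size)
    # sorted() is stable, so equidistant points keep their training order.
    neighbors = sorted(train, key=lambda p: abs(math.log10(p[0]) - target))[:k]
    votes = Counter(label for _, label in neighbors)
    top = max(votes.values())
    # Break ties in favor of the label of the nearest neighbor.
    for _, label in neighbors:
        if votes[label] == top:
            return label

if __name__ == "__main__":
    for n in (3_000, 300_000):
        print(n, "->", knn_predict(TRAIN, n))
```

The same function could in principle serve the second model mentioned in the abstract by swapping the labels for the number of recursive steps; a library classifier such as scikit-learn's `KNeighborsClassifier` would be the usual choice for anything beyond a toy example.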

Taxonomy

Topics: VLSI and FPGA Design Techniques · Matrix Theory and Algorithms · Interconnection Networks and Systems