BB-ML: Basic Block Performance Prediction using Machine Learning Techniques
Hamdy Abdelkhalik, Shamminuj Aktar, Yehia Arafa, Atanu Barai, Gopinath Chennupati, Nandakishore Santhi, Nishant Panda, Nirmal Prajapati, Nazmul Haque Turja, Stephan Eidenbenz, Abdel-Hameed Badawy

TL;DR
This paper introduces a machine learning approach using a Poisson Neural Network to accurately predict GPU basic block performance and application metrics at larger input sizes, enabling detailed performance analysis and extrapolation.
Contribution
It presents a novel ML-based method for fine-grained GPU performance prediction at the basic block level, including extrapolation from small to large inputs, with high accuracy.
Findings
Achieved 93.5% accuracy in basic block count extrapolation.
Predicted performance metrics with less than 1% error for memory requests.
Effectively estimated functional unit utilization with errors below 11%.
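This summary does not say how basic block counts are turned into metrics such as memory requests or functional unit utilization. One plausible reading, assumed here purely for illustration, is that each basic block has a fixed per-execution instruction mix, so an application-level metric is the count-weighted sum over blocks. The profile values and the names `static_profile`, `aggregate_metrics`, and `relative_error` are hypothetical.

```python
import numpy as np

# Hypothetical static profile (an assumption, not data from the paper):
# one row per basic block, columns = [memory ops, ALU ops] per execution.
static_profile = np.array([
    [2, 4],   # BB0
    [0, 6],   # BB1
    [1, 3],   # BB2
])

def aggregate_metrics(bb_counts, profile):
    """Sum per-block instruction mixes, weighted by execution counts."""
    return np.asarray(bb_counts) @ profile

def relative_error(predicted, actual):
    """Relative error of a predicted metric against a measured one."""
    return abs(predicted - actual) / abs(actual)

# Illustrative execution counts for the three blocks.
bb_counts = [100, 50, 10]
mem_ops, alu_ops = aggregate_metrics(bb_counts, static_profile)
```

Under this assumption, any error in the predicted block counts propagates linearly into the metric, which is why count accuracy matters for the reported metric errors.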
Abstract
Recent years have seen the adoption of Machine Learning (ML) techniques to predict the performance of large-scale applications, mostly at a coarse level. In contrast, we propose to use ML techniques for performance prediction at a much finer granularity, namely at the Basic Block (BB) level. Basic blocks are single-entry, single-exit code blocks that compilers use to break a large program into manageable pieces for analysis. We extrapolate the basic block execution counts of GPU applications from smaller input sizes and use them to predict performance at large input sizes. We train a Poisson Neural Network (PNN) model using random input values as well as the lowest input values of the application to learn the relationship between inputs and basic block counts. Experimental results show that the model can accurately predict the basic block execution counts…
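To make the idea of learning input-to-count relationships concrete, here is a minimal sketch. It is not the paper's PNN: it swaps in a much simpler single-layer log-linear Poisson regression, fitted by iteratively reweighted least squares, and it assumes a synthetic basic block whose count grows as 3·n² with input size n. The function names and the data are illustrative assumptions.

```python
import numpy as np

def fit_poisson_irls(X, y, iters=25):
    """Fit log(E[y]) = b0 + X @ b by iteratively reweighted least squares."""
    A = np.column_stack([np.ones(len(y)), X])   # prepend intercept column
    eta = np.log(y + 0.5)                       # start near the observed counts
    for _ in range(iters):
        mu = np.exp(eta)                        # current mean counts
        z = eta + (y - mu) / mu                 # working response
        sw = np.sqrt(mu)                        # sqrt of Poisson weights
        beta, *_ = np.linalg.lstsq(A * sw[:, None], z * sw, rcond=None)
        eta = A @ beta
    return beta

def predict_counts(beta, X):
    """Predicted mean execution counts for feature rows X."""
    A = np.column_stack([np.ones(len(X)), X])
    return np.exp(A @ beta)

# Assumed synthetic data: small input sizes and a count that scales as 3 * n^2.
small_n = np.arange(2, 33, dtype=float)
counts = 3.0 * small_n ** 2
beta = fit_poisson_irls(np.log(small_n)[:, None], counts)

# Extrapolate to an input size far outside the training range.
large_pred = predict_counts(beta, np.log(np.array([[1024.0]])))[0]
```

The log link is what lets a model trained on small inputs extrapolate a power-law count; real basic blocks with data-dependent control flow would need a richer model, which is presumably the role of the neural network in the paper.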
Taxonomy
Topics: Ferroelectric and Negative Capacitance Devices · Parallel Computing and Optimization Techniques · Cloud Computing and Resource Management
