Towards making the most of NLP-based device mapping optimization for OpenCL kernels
Petros Vavaroutsos, Ioannis Oroutzoglou, Dimosthenis Masouros, Dimitrios Soudris

TL;DR
This paper enhances NLP-based machine learning models for optimizing device selection in OpenCL kernels, achieving up to 4% better accuracy than previous methods by addressing key limitations.
Contribution
It introduces four new DNN models that incorporate richer source code context, improving device prediction accuracy over prior work in NLP-driven autotuning.
Findings
Up to 4% improvement in prediction accuracy.
Identification of limitations in the existing DeepTune approach.
Enhanced models provide better contextual understanding.
Abstract
Nowadays, we are living in an era of extreme device heterogeneity. Besides the wide variety of conventional CPU architectures, accelerator devices such as GPUs and FPGAs have also come to the foreground, greatly expanding the pool of available solutions for executing applications. However, choosing the appropriate device for each application's needs is an extremely challenging task due to the abstract relationship between hardware and software. Accurate automatic optimization algorithms are required to cope with the complexity and variety of current hardware and software. Optimal execution has always relied on time-consuming trial-and-error approaches. Machine learning (ML) and Natural Language Processing (NLP) have flourished over the last decade, with research focusing on deep architectures. In this context, the application of natural language processing techniques to source code in order to conduct…
