Data-efficient Performance Modeling via Pre-training

Chunting Liu; Riyadh Baghdadi

arXiv:2501.14438·cs.PL·January 27, 2025

TL;DR

This paper presents a self-supervised pre-training approach using autoencoders to significantly reduce labeled data requirements for performance modeling in code optimization, achieving comparable accuracy with less data.

Contribution

Introduces a self-supervised pre-training scheme with autoencoders that improves performance model accuracy while reducing the amount of labeled data needed to train the model.
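To make the idea of pre-training concrete, here is a minimal sketch in plain Python. It is not the authors' architecture or the Tiramisu implementation: it assumes a tiny linear autoencoder trained by stochastic gradient descent on synthetic 4-dimensional vectors that stand in for unlabeled program features. All names (`init_autoencoder`, `pretrain`, `embed`, the toy data) are hypothetical and chosen only for illustration.

```python
import random

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def init_autoencoder(dim, hidden, rng):
    """Small random weights for a linear encoder (hidden x dim) and decoder (dim x hidden)."""
    enc = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(hidden)]
    dec = [[rng.uniform(-0.1, 0.1) for _ in range(hidden)] for _ in range(dim)]
    return enc, dec

def reconstruction_loss(enc, dec, data):
    """Mean squared reconstruction error over the dataset (no labels needed)."""
    total = 0.0
    for x in data:
        x_hat = matvec(dec, matvec(enc, x))
        total += sum((xh - xi) ** 2 for xh, xi in zip(x_hat, x))
    return total / len(data)

def pretrain(enc, dec, data, epochs=100, lr=0.02):
    """Self-supervised step: update both weight matrices in place to reconstruct inputs."""
    dim, hidden = len(dec), len(enc)
    for _ in range(epochs):
        for x in data:
            z = matvec(enc, x)
            err = [xh - xi for xh, xi in zip(matvec(dec, z), x)]
            # Gradient with respect to the embedding z, computed before dec changes.
            grad_z = [sum(dec[i][j] * err[i] for i in range(dim)) for j in range(hidden)]
            for i in range(dim):
                for j in range(hidden):
                    dec[i][j] -= lr * err[i] * z[j]
            for j in range(hidden):
                for k in range(dim):
                    enc[j][k] -= lr * grad_z[j] * x[k]
    return enc, dec

def embed(enc, x):
    """After pre-training, only the encoder is used to embed an input."""
    return matvec(enc, x)

# Toy "unlabeled programs": 4-d vectors that lie on a 2-d subspace (an assumption).
rng = random.Random(0)
unlabeled = []
for _ in range(50):
    a, b = rng.random(), rng.random()
    unlabeled.append([a, b, 0.5 * (a + b), 0.5 * (a - b)])

enc, dec = init_autoencoder(dim=4, hidden=2, rng=rng)
loss_before = reconstruction_loss(enc, dec, unlabeled)
pretrain(enc, dec, unlabeled)
loss_after = reconstruction_loss(enc, dec, unlabeled)
print(f"reconstruction loss: {loss_before:.4f} -> {loss_after:.4f}")

embedding = embed(enc, unlabeled[0])  # 2-d vector a downstream model could consume
```

In this sketch the reconstruction objective needs no measured execution times, which is the sense in which pre-training is self-supervised. A downstream performance model would then be trained on `embed(enc, x)` for the smaller set of programs that do have labels.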

Findings

01 Achieves similar model accuracy with 5x less labeled data.

02 Reduces data collection time and cost.

03 Improves model accuracy in code performance prediction.

Abstract

Performance models are essential for automatic code optimization, enabling compilers to predict the effects of code transformations on performance and guide search for optimal transformations. Building state-of-the-art performance models with deep learning, however, requires vast labeled datasets of random programs -- an expensive and time-consuming process, stretching over months. This paper introduces a self-supervised pre-training scheme with autoencoders to reduce the need for labeled data. By pre-training on a large dataset of random programs, the autoencoder learns representations of code and transformations, which are then used to embed programs for the performance model. Implemented in the Tiramisu autoscheduler, our approach improves model accuracy with less data. For example, to achieve a MAPE of 20.72%, the original model requires 18 million data points, whereas our method…
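The accuracy figure quoted above is a MAPE (mean absolute percentage error). The page does not define it, so the sketch below assumes the standard definition: the mean of |measured - predicted| / |measured|, expressed as a percentage. The function name `mape` and the sample values are illustrative only and are not data from the paper.

```python
def mape(measured, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every measured value is non-zero, since each error is
    taken relative to its measured value.
    """
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((m - p) / m) for m, p in zip(measured, predicted))
    return 100.0 * total / len(measured)

# Made-up values: relative errors are 0.5 and 0.25, so the mean is 0.375.
example = mape([2.0, 4.0], [1.0, 5.0])
print(f"MAPE: {example:.2f}%")
```

Because each error is divided by the measured value, MAPE treats a 10% miss on a fast program the same as a 10% miss on a slow one, which makes it a natural fit when programs span a wide range of execution times.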

Taxonomy

Topics: Software System Performance and Reliability