IR2Vec: LLVM IR based Scalable Program Embeddings

S. VenkataKeerthy; Rohit Aggarwal; Shalini Jain; Maunendra Sankar Desarkar; Ramakrishna Upadrasta; Y. N. Srikant

arXiv:1909.06228·cs.PL·December 25, 2020



TL;DR

IR2Vec introduces a scalable, IR-based program embedding method that captures syntax and semantics, enabling faster training and improved performance in optimization tasks across multiple platforms.

Contribution

The paper presents IR2Vec, a novel IR-based embedding infrastructure with symbolic and flow-aware encodings, outperforming existing methods in speed and accuracy.

Findings

01. Outperforms existing methods in device mapping and thread coarsening tasks.

02. Enables faster training with non-sequential models.

03. Achieves state-of-the-art or improved results in benchmark suites.

Abstract

We propose IR2Vec, a Concise and Scalable encoding infrastructure to represent programs as a distributed embedding in continuous space. This distributed embedding is obtained by combining representation learning methods with flow information to capture the syntax as well as the semantics of the input programs. As our infrastructure is based on the Intermediate Representation (IR) of the source code, obtained embeddings are both language and machine independent. The entities of the IR are modeled as relationships, and their representations are learned to form a seed embedding vocabulary. Using this infrastructure, we propose two incremental encodings: Symbolic and Flow-Aware. Symbolic encodings are obtained from the seed embedding vocabulary, and Flow-Aware encodings are obtained by augmenting the Symbolic encodings with the flow information. We show the effectiveness of our methodology…
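
The difference between the two encodings can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration, not the paper's implementation: the seed vocabulary is random rather than learned, the weights W_OPCODE, W_TYPE and W_ARG are arbitrary, the weighted-sum combination rule is assumed, and the three-instruction program is hypothetical. The sketch only shows the idea stated in the abstract: a Symbolic encoding is built from seed vectors alone, while a Flow-Aware encoding additionally uses flow information, modeled here as the encoding of the instruction that defines each argument.

```python
import numpy as np

DIM = 8
rng = np.random.default_rng(0)

# Hypothetical seed embedding vocabulary: one vector per IR entity.
# In IR2Vec these are learned; here they are random placeholders.
ENTITIES = ["add", "mul", "ret", "i32", "variable", "constant"]
SEED = {name: rng.standard_normal(DIM) for name in ENTITIES}

# Illustrative weights (assumed): opcode counts most, arguments least.
W_OPCODE, W_TYPE, W_ARG = 1.0, 0.5, 0.2


def symbolic_encoding(opcode, type_, arg_kinds):
    """Combine seed vectors of the opcode, type, and generic argument kinds."""
    vec = W_OPCODE * SEED[opcode] + W_TYPE * SEED[type_]
    for kind in arg_kinds:
        vec = vec + W_ARG * SEED[kind]
    return vec


def flow_aware_encoding(opcode, type_, args, defs):
    """Like the symbolic encoding, but an argument with a known defining
    instruction contributes that instruction's encoding instead of the
    generic seed vector for its kind."""
    vec = W_OPCODE * SEED[opcode] + W_TYPE * SEED[type_]
    for kind, name in args:
        vec = vec + W_ARG * defs.get(name, SEED[kind])
    return vec


# Toy straight-line program as (dest, opcode, type, [(arg kind, arg name)]):
#   %a = add i32 %x, 1
#   %b = mul i32 %a, %a
#   ret i32 %b
PROGRAM = [
    ("a", "add", "i32", [("variable", "x"), ("constant", None)]),
    ("b", "mul", "i32", [("variable", "a"), ("variable", "a")]),
    (None, "ret", "i32", [("variable", "b")]),
]


def encode_program(program, flow_aware):
    """Sum the per-instruction encodings into one program vector."""
    defs = {}
    total = np.zeros(DIM)
    for dest, opcode, type_, args in program:
        if flow_aware:
            vec = flow_aware_encoding(opcode, type_, args, defs)
        else:
            vec = symbolic_encoding(opcode, type_, [kind for kind, _ in args])
        if dest is not None:
            defs[dest] = vec  # later uses of dest see this encoding
        total += vec
    return total


symbolic_vec = encode_program(PROGRAM, flow_aware=False)
flow_vec = encode_program(PROGRAM, flow_aware=True)
```

In this sketch the first instruction gets the same vector under both schemes because %x has no definition in the program; the encodings diverge from the second instruction on, where %a is replaced by the vector of the add that defines it.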


Code & Models

Repositories

IITH-Compilers/IR2Vec (Official)
