FAIR: Flow Type-Aware Pre-Training of Compiler Intermediate Representations

Changan Niu; Chuanyi Li; Vincent Ng; David Lo; Bin Luo

arXiv:2309.04828·cs.SE·September 12, 2023

TL;DR

FAIR is a novel pre-training model for compiler IRs that effectively captures flow types and long-distance dependencies, improving performance on code-related tasks.

Contribution

Introduces a flow type-aware pre-training approach for IRs using a graph transformer and new tasks to better understand IR semantics.

Findings

01. Achieves state-of-the-art results on four downstream tasks.

02. Effectively models flow types and long-range dependencies.

03. Addresses over-smoothing and over-squashing in IR graph learning.

Abstract

While most existing pre-trained models of code learn source code features such as code tokens and abstract syntax trees, some works instead learn from compiler intermediate representations (IRs). Existing IR-based models typically use IR features such as instructions, control and data flow graphs (CDFGs), and call graphs. However, these methods conflate variable nodes and instruction nodes in a CDFG and fail to distinguish different types of flows, and the neural networks they use fail to capture long-distance dependencies and suffer from over-smoothing and over-squashing. To address these weaknesses, we propose FAIR, a Flow type-Aware pre-trained model for IR that employs (1) a novel input representation of IR programs; (2) a Graph Transformer to address the over-smoothing, over-squashing, and long-distance dependency problems; and (3) five…
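To make the abstract's criticism concrete, the following is a minimal sketch of a graph that keeps variable nodes separate from instruction nodes and labels every edge with a flow type. It is an illustration only, not FAIR's actual input representation: the names `NodeKind`, `FlowType`, `Node`, `Edge`, and `edges_by_flow`, the two-way control/data split, and the toy IR snippet are all assumptions introduced here.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class NodeKind(Enum):
    # Hypothetical node categories: the abstract only says that variable
    # and instruction nodes should not be confused with each other.
    INSTRUCTION = "instruction"
    VARIABLE = "variable"


class FlowType(Enum):
    # Hypothetical flow categories; the paper's real taxonomy may be finer.
    CONTROL = "control"
    DATA = "data"


@dataclass(frozen=True)
class Node:
    name: str
    kind: NodeKind


@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    flow: FlowType


def edges_by_flow(edges):
    """Group (src, dst) pairs by their flow type, preserving input order."""
    grouped = defaultdict(list)
    for edge in edges:
        grouped[edge.flow].append((edge.src, edge.dst))
    return dict(grouped)


# Toy IR (assumed for illustration):  %1 = add %a, %b ; ret %1
nodes = [
    Node("add", NodeKind.INSTRUCTION),
    Node("ret", NodeKind.INSTRUCTION),
    Node("%a", NodeKind.VARIABLE),
    Node("%b", NodeKind.VARIABLE),
    Node("%1", NodeKind.VARIABLE),
]
edges = [
    Edge("add", "ret", FlowType.CONTROL),  # execution order
    Edge("%a", "add", FlowType.DATA),      # operand read
    Edge("%b", "add", FlowType.DATA),      # operand read
    Edge("add", "%1", FlowType.DATA),      # result definition
    Edge("%1", "ret", FlowType.DATA),      # operand read
]

grouped = edges_by_flow(edges)
```

With explicit node kinds and edge labels, a downstream model can treat a control edge between two instructions differently from a data edge between a variable and an instruction, which is the distinction the abstract says earlier CDFG-based methods lose.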

Code & Models

Repositories

nougatca/fair (PyTorch, official)

Taxonomy

Topics: Software Engineering Research · Software Testing and Debugging Techniques · Software System Performance and Reliability