Codestitcher: Inter-Procedural Basic Block Layout Optimization
Rahman Lavaee, John Criswell, Chen Ding

TL;DR
Codestitcher is an inter-procedural code layout optimizer that reorders the basic blocks of an executable to improve cache and TLB performance, yielding speedups of 3% to 25% on five large applications.
Contribution
The paper introduces Codestitcher, a hierarchical framework for inter-procedural basic block reordering that can target locality at several layers of the memory hierarchy and that improves on LLVM's PGO in the reported evaluation.
Findings
Average performance improvement of 10% over the original program (range 3% to 25%) across the five tested applications: MySQL, Clang, Firefox, Apache, and Python.
Additional 4% improvement over LLVM's PGO.
Additional 3% improvement over PGO combined with the best function reordering technique.
Abstract
Modern software executes a large amount of code. Previous techniques of code layout optimization were developed one or two decades ago and have become inadequate to cope with the scale and complexity of new types of applications such as compilers, browsers, interpreters, language VMs and shared libraries. This paper presents Codestitcher, an inter-procedural basic block code layout optimizer which reorders basic blocks in an executable to benefit from better cache and TLB performance. Codestitcher provides a hierarchical framework which can be used to improve locality in various layers of the memory hierarchy. Our evaluation shows that Codestitcher improves the performance of the original program by 3% to 25% (on average, by 10%) on 5 widely used applications with large code sizes: MySQL, Clang, Firefox, Apache, and Python. It gives an additional improvement of 4% over LLVM's PGO…
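To make the idea of profile-guided basic block reordering concrete, here is a minimal Python sketch. It is not Codestitcher's algorithm, which the abstract does not spell out. It is a classic greedy chain-merging heuristic in the style of Pettis and Hansen, applied to a single graph that mixes intra-function branch edges and call edges. The function name `greedy_layout`, the `function:block` naming scheme, and the edge counts are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (source block, destination block)


def greedy_layout(blocks: List[str], edge_counts: Dict[Edge, int]) -> List[str]:
    """Order blocks so that the hottest control-flow edges become fall-throughs.

    Illustrative greedy chaining only; not the method from the paper.
    """
    # Every block starts as its own single-block chain.
    chain_of = {b: [b] for b in blocks}

    # Visit edges from hottest to coldest; ties are broken by name so the
    # result is deterministic.
    hottest_first = sorted(edge_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for (src, dst), _count in hottest_first:
        a, b = chain_of[src], chain_of[dst]
        # Merge only when src ends one chain and dst begins a different one,
        # so that dst can be placed immediately after src.
        if a is not b and a[-1] == src and b[0] == dst:
            a.extend(b)
            for blk in b:
                chain_of[blk] = a

    # Emit chains in the order their first block appears in the input.
    layout: List[str] = []
    emitted = set()
    for blk in blocks:
        chain = chain_of[blk]
        if id(chain) not in emitted:
            emitted.add(id(chain))
            layout.extend(chain)
    return layout


# Hypothetical profile: a hot loop in main that calls parse, plus a cold
# error path.
blocks = ["main:entry", "main:error", "main:loop", "parse:entry", "parse:exit"]
profile = {
    ("main:entry", "main:loop"): 1000,
    ("main:entry", "main:error"): 1,
    ("main:loop", "parse:entry"): 900,
    ("parse:entry", "parse:exit"): 900,
    ("parse:exit", "main:loop"): 900,
}
layout = greedy_layout(blocks, profile)
# layout == ["main:entry", "main:loop", "parse:entry", "parse:exit", "main:error"]
```

In this toy profile the hot path through `main` and `parse` ends up contiguous across the function boundary, and the cold error block is pushed to the end. That is the kind of effect an inter-procedural layout aims for, whereas a purely function-level reordering would keep each function's blocks together.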
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Cloud Computing and Resource Management · Advanced Data Storage Technologies
