Technical Report about Tiramisu: a Three-Layered Abstraction for Hiding Hardware Complexity from DSL Compilers
Riyadh Baghdadi, Jessica Ray, Malek Ben Romdhane, Emanuele Del Sozzo, Patricia Suriana, Shoaib Kamil, Saman Amarasinghe

TL;DR
Tiramisu introduces a three-layer IR for DSL compilers that simplifies optimization and enables efficient, multi-architecture code generation, significantly reducing complexity while maintaining high performance.
Contribution
The paper presents Tiramisu, a novel three-level IR that separates the algorithm, how it is executed, and where data is stored, so that code for multiple architectures can be generated from a common middle-end.
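The separation described above can be illustrated with a small sketch. This is not Tiramisu's API or IR. It is a toy Python model in which every name (`algorithm`, `row_major`, `column_major`, `store_row_major`, `store_transposed`, `run`) is hypothetical and chosen only for illustration. It assumes the three concerns can be modeled as three independent functions: what to compute at each point, in what order points are visited, and where each result is stored.

```python
from typing import Callable, Iterable, List, Tuple

Point = Tuple[int, int]


def algorithm(i: int, j: int) -> int:
    """Layer 1 (hypothetical): what is computed at each point.
    Says nothing about execution order or storage."""
    return i * 10 + j


def row_major(n: int, m: int) -> Iterable[Point]:
    """Layer 2 (hypothetical): one possible execution order."""
    for i in range(n):
        for j in range(m):
            yield (i, j)


def column_major(n: int, m: int) -> Iterable[Point]:
    """Layer 2 (hypothetical): an alternative order, e.g. after loop interchange."""
    for j in range(m):
        for i in range(n):
            yield (i, j)


def store_row_major(m: int) -> Callable[[int, int], int]:
    """Layer 3 (hypothetical): map point (i, j) to a flat buffer index."""
    return lambda i, j: i * m + j


def store_transposed(n: int) -> Callable[[int, int], int]:
    """Layer 3 (hypothetical): a different layout for the same results."""
    return lambda i, j: j * n + i


def run(n: int, m: int,
        schedule: Callable[[int, int], Iterable[Point]],
        layout: Callable[[int, int], int]) -> List[int]:
    """Combine the three layers; the algorithm itself is never edited."""
    buf = [0] * (n * m)
    for i, j in schedule(n, m):
        buf[layout(i, j)] = algorithm(i, j)
    return buf
```

In this toy model, swapping `row_major` for `column_major` changes only the visiting order, and swapping `store_row_major` for `store_transposed` changes only the buffer layout. The definition of `algorithm` stays the same in both cases, which is the property the three-level separation is meant to provide.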
Findings
Enabled new algorithm expressions like recurrent filters.
Achieved complex loop transformations such as wavefront parallelization.
Generated code with up to 16X speedup over existing implementations.
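To make the wavefront parallelization mentioned in the findings concrete, here is a minimal Python sketch. It is not code generated by Tiramisu or taken from the paper. The recurrence, the function names `sequential` and `wavefront`, and the grid size are assumptions chosen for illustration. Each cell depends on its upper and left neighbors, so the rows and columns cannot be parallelized directly, but all cells on the same anti-diagonal (`i + j == d`) are independent of one another.

```python
from typing import List


def sequential(n: int) -> List[List[int]]:
    """Reference order: plain nested loops over rows and columns."""
    a = [[1] * n for _ in range(n)]
    for i in range(1, n):
        for j in range(1, n):
            a[i][j] = a[i - 1][j] + a[i][j - 1]
    return a


def wavefront(n: int) -> List[List[int]]:
    """Same recurrence, visited one anti-diagonal at a time."""
    a = [[1] * n for _ in range(n)]
    # d = i + j ranges over 2 .. 2n-2 for interior cells (i, j >= 1).
    for d in range(2, 2 * n - 1):
        lo = max(1, d - (n - 1))   # keeps j = d - i <= n - 1
        hi = min(n - 1, d - 1)     # keeps j = d - i >= 1
        # Cells on this diagonal read only from diagonal d - 1, so this
        # inner loop could be run in parallel; it is serial here for clarity.
        for i in range(lo, hi + 1):
            j = d - i
            a[i][j] = a[i - 1][j] + a[i][j - 1]
    return a
```

Both functions produce the same grid. The wavefront version only reorders the iterations so that the inner loop carries no dependence, which is what makes it a candidate for parallel execution.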
Abstract
High-performance DSL developers work hard to take advantage of modern hardware. The DSL compilers have to build their own complex middle-ends before they can target a common back-end such as LLVM, which only handles single instruction streams with SIMD instructions. We introduce Tiramisu, a common middle-end that can generate efficient code for modern processors and accelerators such as multicores, GPUs, FPGAs and distributed clusters. Tiramisu introduces a novel three-level IR that separates the algorithm, how that algorithm is executed, and where intermediate data are stored. This separation simplifies optimization and makes targeting multiple hardware architectures from the same algorithm easier. As a result, DSL compilers can be made considerably less complex with no loss of performance while immediately targeting multiple hardware or hardware combinations such as distributed nodes…
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Interconnection Networks and Systems · Embedded Systems Design Techniques
