Compiling Halide Programs to Push-Memory Accelerators
Qiaoyi Liu, Dillon Huff, Jeff Setter, Maxwell Strange, Kathleen Feng, Kavya Sreedhar, Ziheng Wang, Keyi Zhang, Mark Horowitz, Priyanka Raina, and Fredrik Kjolstad

TL;DR
This paper introduces a compiler approach for programmable push-memory accelerators. It represents memory as a unified buffer that combines storage with address generation and control logic, which lets the compiler target these accelerators for image processing and machine learning applications and yields significant gains over an FPGA in runtime and energy efficiency.
Contribution
The paper proposes a compiler abstraction for push memories, the unified buffer, and a memory mapping algorithm that combines polyhedral analysis with vectorization.
Findings
Achieves 4.7x better runtime than an FPGA
Attains 4.3x better energy efficiency
Supports a wide range of applications
Abstract
Image processing and machine learning applications benefit tremendously from hardware acceleration, but existing compilers target either FPGAs, which sacrifice power and performance for flexible hardware, or ASICs, which rapidly become obsolete as applications change. Programmable domain-specific accelerators have emerged as a promising middle-ground between these two extremes, but such architectures have traditionally been difficult compiler targets. The main obstacle is that these accelerators often use a different memory abstraction than CPUs and GPUs: push memories that send a data stream from one computation kernel to other kernels, possibly reordered. To address the compilation challenges caused by push memories, we propose that the representation of memory in the middle and backend of the compiler be altered to combine storage with address generation and control logic in a…
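To make the push-memory idea concrete, here is a minimal Python sketch of a buffer that bundles storage with its own address generation and schedule, so that a stream pushed in one order comes out reordered. It is an illustration only, not the paper's unified buffer definition: the names `affine` and `PushBuffer`, the port tuple layout, and the one-event-per-cycle model are all assumptions made for this example.

```python
from itertools import product


def affine(coeffs, offset=0):
    """Hypothetical helper: build f(idx) = offset + sum(c * i) over loop indices."""
    return lambda idx: offset + sum(c * i for c, i in zip(coeffs, idx))


class PushBuffer:
    """Toy push memory (assumed model): storage plus per-port address and cycle generators."""

    def __init__(self, capacity, write_port, read_port):
        # Each port is a tuple: (loop_extents, address_fn, cycle_fn).
        self.mem = [None] * capacity
        self.write_port = write_port
        self.read_port = read_port

    def _events(self, port, kind):
        extents, address_fn, cycle_fn = port
        # Walk the rectangular iteration domain in loop order.
        for idx in product(*(range(n) for n in extents)):
            yield cycle_fn(idx), kind, address_fn(idx) % len(self.mem)

    def run(self, stream):
        stream = iter(stream)
        # kind 0 = write, kind 1 = read, so a write precedes a read in the same cycle.
        events = sorted(
            list(self._events(self.write_port, 0))
            + list(self._events(self.read_port, 1))
        )
        out = []
        for _cycle, kind, address in events:
            if kind == 0:
                self.mem[address] = next(stream)  # producer pushes one value
            else:
                out.append(self.mem[address])  # buffer pushes one value downstream
        return out


# Write a 2x3 tile in row-major order during cycles 0..5.
write_port = ((2, 3), affine((3, 1)), affine((3, 1)))
# Read it back column by column (loop indices are (col, row)) during cycles 6..11.
read_port = ((3, 2), affine((1, 3)), affine((2, 1), offset=6))

result = PushBuffer(6, write_port, read_port).run(range(6))
print(result)  # [0, 3, 1, 4, 2, 5]
```

The consumer never issues an address. The buffer's own read port decides what is sent and when, which is the property that distinguishes a push memory from the load/store memories of CPUs and GPUs.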
Taxonomy
Topics: Advanced Neural Network Applications · Parallel Computing and Optimization Techniques · CCD and CMOS Imaging Sensors
