Towards Power Efficient DNN Accelerator Design on Reconfigurable Platform
Rourab Paul, Sreetama Sarkar, Suman Sau, Koushik Chakraborty, Sanghamitra Roy, Amlan Chakrabarti

TL;DR
This paper presents an ultra-low-power FPGA implementation of a TPU for edge applications, using voltage scaling and partitioning strategies to improve energy efficiency while maintaining performance.
Contribution
It introduces a novel FPGA partitioning and voltage biasing scheme for TPU acceleration, enabling energy savings through static and runtime calibration methods.
Findings
The FPGA TPU implementation demonstrates a significant power reduction.
Partitioning based on slack values improves timing and energy efficiency.
Simulation results confirm the viability of a voltage-scaled TPU on FPGA platforms.
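The slack-based partitioning named in the findings can be illustrated with a minimal sketch. The summary does not give the paper's actual partitioning algorithm, so everything here is an assumption for illustration: the function name partition_by_slack, the processing-element labels, the slack values, and the choice of near-equal-size groups. The idea it shows is only the general one: elements with similar timing slack are grouped together, and groups with more slack are the usual candidates for a lower supply voltage.

```python
from typing import Dict, List


def partition_by_slack(slack_ns: Dict[str, float], num_partitions: int) -> List[List[str]]:
    """Group elements into near-equal-size partitions ordered by timing slack.

    Hypothetical helper, not the paper's method. `slack_ns` maps an element
    name (for example a systolic-array processing element) to its timing
    slack in nanoseconds. The first partition holds the lowest-slack elements.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")

    # Sort element names by ascending slack (most timing-critical first).
    ordered = sorted(slack_ns, key=slack_ns.get)

    # Split the sorted list into contiguous chunks of near-equal size;
    # the first `extra` chunks receive one additional element.
    size, extra = divmod(len(ordered), num_partitions)
    partitions = []
    start = 0
    for i in range(num_partitions):
        end = start + size + (1 if i < extra else 0)
        partitions.append(ordered[start:end])
        start = end
    return partitions


# Illustrative slack values only; these numbers are not from the paper.
example_slack = {"pe0": 0.2, "pe1": 1.5, "pe2": 0.4, "pe3": 1.1}
print(partition_by_slack(example_slack, 2))
```

Equal-size grouping is the simplest possible choice. A real flow would more likely derive partition boundaries from a timing report and from placement constraints on the device.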
Abstract
The rapid rise of Field Programmable Gate Arrays (FPGAs) has accelerated research on hardware implementations of Deep Neural Networks (DNNs). Among DNN processors, domain-specific architectures such as Google's Tensor Processing Unit (TPU) have outperformed conventional GPUs. However, implementations of TPUs in reconfigurable hardware should emphasize energy savings to meet green computing requirements. Voltage scaling, a popular approach to energy savings, is delicate on FPGAs because it can cause timing failures if not applied carefully. In this work, we present an ultra-low-power FPGA implementation of a TPU for edge applications. We divide the systolic array of a TPU into different FPGA partitions, where each partition uses a different near-threshold computing (NTC) biasing voltage to run its FPGA cores. The biasing voltage for each partition is roughly…
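To see why per-partition voltage scaling can save energy, here is a rough sketch based on the standard first-order model in which dynamic power scales with the square of the supply voltage at fixed frequency and switched capacitance. The function name relative_dynamic_power, the partition sizes, and the voltages are all hypothetical and are not taken from the paper. The model ignores static (leakage) power and any overhead of running partitions at different voltages.

```python
from typing import Sequence


def relative_dynamic_power(
    partition_sizes: Sequence[int],
    partition_voltages: Sequence[float],
    nominal_voltage: float,
) -> float:
    """Estimate scaled dynamic power as a fraction of nominal dynamic power.

    Assumes every element contributes equally at nominal voltage and that
    dynamic power is proportional to V^2 (same frequency and capacitance).
    Hypothetical helper for illustration, not the paper's power model.
    """
    if len(partition_sizes) != len(partition_voltages):
        raise ValueError("sizes and voltages must have the same length")
    total = sum(partition_sizes)
    if total <= 0 or nominal_voltage <= 0:
        raise ValueError("need at least one element and a positive nominal voltage")

    # Each partition contributes its element count weighted by (V / V_nominal)^2.
    scaled = sum(
        n * (v / nominal_voltage) ** 2
        for n, v in zip(partition_sizes, partition_voltages)
    )
    return scaled / total


# Illustrative values only: two partitions of two elements each,
# one kept at a nominal 1.0 V and one scaled to 0.5 V.
print(relative_dynamic_power([2, 2], [1.0, 0.5], 1.0))
```

Under these assumptions, lowering the voltage of only the partitions with enough timing slack still reduces total dynamic power, which is the motivation for assigning voltages per partition rather than to the whole array.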
Taxonomy
Topics: Advanced Memory and Neural Computing · Parallel Computing and Optimization Techniques · Low-power high-performance VLSI design
