InTAR: Inter-Task Auto-Reconfigurable Accelerator Design for High Data Volume Variation in DNNs
Zifan He, Anderson Truong, Yingqi Cao, Jason Cong

TL;DR
InTAR is an FPGA accelerator that dynamically switches execution patterns to efficiently handle high data volume variation in DNN tasks, achieving significant speedups and resource savings.
Contribution
The paper introduces a novel reconfigurable FPGA accelerator design that encodes static reconfiguration schedules for optimized multi-task DNN processing.
Findings
InTAR achieves 1.8x to 7.1x speedups over conventional fixed-pattern (dataflow or sequential) accelerators.
It outperforms state-of-the-art FPGA accelerators on GPT-2 medium by up to 39.14x.
InTAR demonstrates 1.66x to 7.17x better power efficiency than GPUs.
Abstract
The rise of deep neural networks (DNNs) has driven an increased demand for computing power and memory. Modern DNNs exhibit high data volume variation (HDV) across tasks, which poses challenges for FPGA acceleration: conventional accelerators rely on fixed execution patterns (dataflow or sequential) that can lead to pipeline stalls or necessitate frequent off-chip memory accesses. To address these challenges, we introduce the Inter-Task Auto-Reconfigurable Accelerator (InTAR), a novel accelerator design methodology for HDV applications on FPGAs. InTAR combines the high computational efficiency of sequential execution with the reduced off-chip memory overhead of dataflow execution. It switches execution patterns automatically with a static schedule determined before circuit design based on resource constraints and problem sizes. Unlike previous reconfigurable accelerators, InTAR encodes…
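As a rough illustration of what a static schedule chosen from resource constraints and problem sizes could look like, the sketch below picks an execution pattern for each task once, ahead of time. It is a toy model, not the InTAR methodology: the threshold rule (run sequentially when a task's intermediate output fits on chip, stream in dataflow mode otherwise), the `Task` type, the `build_static_schedule` function, and all task names and sizes are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Task:
    name: str          # hypothetical task label
    output_bytes: int  # size of the intermediate data this task produces


def build_static_schedule(tasks: List[Task], on_chip_bytes: int) -> List[str]:
    """Choose one execution pattern per task before any hardware is generated."""
    schedule = []
    for task in tasks:
        # Assumed rule: outputs that fit in on-chip memory are handled
        # sequentially; larger outputs are streamed to the next task in
        # dataflow mode so they never have to be written off chip.
        mode = "sequential" if task.output_bytes <= on_chip_bytes else "dataflow"
        schedule.append(mode)
    return schedule


# Hypothetical task chain with strongly varying intermediate data volume.
tasks = [
    Task("qkv_projection", 64 * 1024),
    Task("attention_scores", 4 * 1024 * 1024),
    Task("softmax", 4 * 1024 * 1024),
    Task("output_projection", 64 * 1024),
]
schedule = build_static_schedule(tasks, on_chip_bytes=1024 * 1024)

for task, mode in zip(tasks, schedule):
    print(f"{task.name}: {mode}")
```

Because the schedule depends only on the problem sizes and the resource budget, it can be computed once and fixed before design time, which is the property the abstract describes; a real scheduler would also have to weigh compute resources and latency, which this sketch ignores.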
Taxonomy
Topics: Embedded Systems Design Techniques
Methods: Attention Is All You Need · Byte Pair Encoding · Layer Normalization · Residual Connection · Linear Layer · Dense Connections · Multi-Head Attention · Adam · Softmax
