EdgeDRNN: Enabling Low-latency Recurrent Neural Network Edge Inference

Chang Gao; Antonio Rios-Navarro; Xi Chen; Tobi Delbruck; Shih-Chii Liu

arXiv:1912.12193·eess.SP·July 30, 2020·AICAS

TL;DR

EdgeDRNN is a GRU-RNN accelerator for edge devices that exploits temporal sparsity to cut off-chip memory access by up to 10x, running a 10-million-parameter network in under 0.5 ms at 2.42 W wall-plug power.

Contribution

The paper introduces EdgeDRNN, a GRU-based RNN accelerator that applies the delta network algorithm to exploit temporal sparsity, enabling low-latency, low-power inference on edge devices.

Findings

01. Reduces off-chip memory access by up to 10x

02. Achieves under 0.5 ms inference time for a 10-million-parameter GRU-RNN

03. Outperforms NVIDIA Jetson Nano, Jetson TX2, and Intel Neural Compute Stick 2 in latency by 6x

Abstract

This paper presents EdgeDRNN, a Gated Recurrent Unit (GRU) based recurrent neural network (RNN) accelerator designed for portable edge computing. EdgeDRNN adopts the spiking-neural-network-inspired delta network algorithm to exploit temporal sparsity in RNNs, reducing off-chip memory access by a factor of up to 10x with tolerable accuracy loss. Experimental results on a 10-million-parameter 2-layer GRU-RNN, with weights stored in DRAM, show that EdgeDRNN completes inference in under 0.5 ms. Drawing 2.42 W of wall-plug power on an entry-level USB-powered FPGA board, it achieves latency comparable to a 92 W NVIDIA 1080 GPU and outperforms the NVIDIA Jetson Nano, Jetson TX2, and Intel Neural Compute Stick 2 in latency by 6x. For a batch size of 1, EdgeDRNN achieves a mean effective throughput of 20.2 GOp/s and a wall-plug power efficiency over 4x higher than all other platforms.
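The abstract credits the memory-access savings to the delta network algorithm. As a rough illustration of the general idea (not the EdgeDRNN hardware design), the sketch below updates a matrix-vector product incrementally: an input element is propagated only when it has changed by more than a threshold since it was last propagated, so the weight column for an unchanged element is never fetched. The function name `delta_matvec_step`, the `threshold` parameter, and the toy values are assumptions made for this example.

```python
def delta_matvec_step(W, x, x_ref, acc, threshold):
    """One delta-encoded update of acc, which tracks W @ x_ref.

    W         : weight matrix as a list of rows
    x         : current input vector
    x_ref     : last propagated value of each input (mutated in place)
    acc       : running matrix-vector product (mutated in place)
    threshold : hypothetical delta threshold; smaller changes are skipped
    Returns the number of weight columns that had to be read.
    """
    cols_used = 0
    for j, xj in enumerate(x):
        delta = xj - x_ref[j]
        if abs(delta) > threshold:
            # Significant change: propagate it and remember the new value.
            x_ref[j] = xj
            for i in range(len(W)):
                acc[i] += W[i][j] * delta
            cols_used += 1
        # Otherwise column j of W is skipped entirely (no memory access).
    return cols_used


# Toy sequence: the second input barely changes, the third changes one element.
W = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
x_ref = [0.0, 0.0, 0.0]
acc = [0.0, 0.0]
inputs = [[1.0, 1.0, 1.0],
          [1.0, 1.05, 1.0],
          [1.0, 1.05, 2.0]]

fetched = [delta_matvec_step(W, x, x_ref, acc, threshold=0.1) for x in inputs]
print(fetched)  # [3, 0, 1]: columns read per time step
print(acc)      # [9.0, 21.0]: equals W @ x_ref, an approximation of W @ x
```

In this toy run only 4 of 9 possible column reads occur. The result differs slightly from the exact product because the 0.05 change was dropped, which mirrors the trade-off the abstract describes as reduced memory access with tolerable accuracy loss.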
