FPGA Based Implementation of Deep Neural Networks Using On-chip Memory Only
Jinhwan Park, Wonyong Sung

TL;DR
This paper presents an FPGA-based fixed-point deep neural network system that uses only on-chip memory. On MNIST digit recognition and TIMIT phoneme recognition, it runs at about one quarter of the speed of a GPU implementation while consuming less than 5 W.
Contribution
The work introduces an FPGA implementation of DNNs that keeps all weights in on-chip memory and never accesses external DRAM. To fit the limited on-chip capacity, it uses 3-bit weights obtained through training-based fixed-point weight optimization.
Findings
Speed is about one quarter (25%) of that of a GPU-based implementation.
Power consumption is less than 5 W.
Speed is much better than that of a PC-based implementation.
Abstract
Deep neural networks (DNNs) demand a very large amount of computation and weight storage, so efficient implementation on special-purpose hardware is highly desirable. In this work, we have developed an FPGA-based fixed-point DNN system that uses only on-chip memory and therefore never accesses external DRAM. The execution time and energy consumption of the developed system are compared with those of a GPU-based implementation. Since the memory capacity of an FPGA is limited, only 3-bit weights are used in this implementation, and training-based fixed-point weight optimization is employed. The implementation on a Xilinx XC7Z045 is tested on the MNIST handwritten digit recognition benchmark and a phoneme recognition task on the TIMIT corpus. The obtained speed is about one quarter of that of a GPU-based implementation and much better than that of a PC-based one. The power consumption is less than 5 W at the full…
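To make the 3-bit weight constraint concrete, here is a minimal Python sketch of uniform 3-bit quantization and the resulting storage footprint. It is an illustration under stated assumptions, not the authors' implementation: the 7-level symmetric grid (integer codes -3 to +3), the step size `step`, the function names, and the layer sizes in the usage note are all hypothetical, and the training-based weight optimization mentioned in the abstract is not shown.

```python
def quantize_3bit(weights, step):
    """Map each weight to an integer code in [-3, 3] (assumed 7-level grid).

    The represented value is code * step. Values outside the grid saturate.
    Note: Python's round() rounds exact halves to the nearest even integer.
    """
    codes = []
    for w in weights:
        k = round(w / step)        # nearest grid index
        k = max(-3, min(3, k))     # saturate to the 3-bit range
        codes.append(k)
    return codes


def dequantize(codes, step):
    """Recover the fixed-point weight values from their integer codes."""
    return [k * step for k in codes]


def weight_storage_bits(layer_sizes, bits_per_weight=3):
    """Total weight storage for a fully connected network (biases ignored)."""
    n_weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    return n_weights * bits_per_weight
```

For example, `quantize_3bit([0.26, -0.9, 0.04, 2.0], 0.25)` yields the codes `[1, -3, 0, 3]`, and a hypothetical 784-1000-10 network would need `weight_storage_bits([784, 1000, 10])`, that is 2,382,000 bits, for its weights. Storing the same weights as 32-bit floats would take roughly ten times as much memory, which is why a low bit width matters when everything must fit on chip.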
