An Efficient FPGA-based Accelerator for Deep Forest
Mingyu Zhu, Jiapeng Luo, Wendong Mao, Zhongfeng Wang

TL;DR
This paper presents the first FPGA-based hardware accelerator for Deep Forest, significantly improving inference speed and efficiency compared to traditional CPU implementations.
Contribution
It introduces a novel FPGA architecture with a dedicated node computing unit and adaptive dataflow for Deep Forest models, optimizing performance and resource utilization.
Findings
Achieves approximately 40x speedup over a 40-core CPU.
Demonstrates improved hardware utilization and power efficiency.
Validates effectiveness on datasets like ADULT and Face Mask Detection.
Abstract
Deep Forest is a prominent machine learning algorithm known for its high accuracy in forecasting. Compared with deep neural networks, Deep Forest requires almost no multiplication operations and performs better on small datasets. However, because of its deep structure and large number of forests, it incurs heavy computation and memory consumption. This paper proposes an efficient hardware accelerator for Deep Forest models, which is also the first work to implement Deep Forest on an FPGA. First, a carefully designed node computing unit (NCU) improves inference speed. Second, building on the NCU, an efficient architecture and an adaptive dataflow are proposed to alleviate node computing imbalance in the classification process. Moreover, an optimized storage scheme improves hardware utilization and power efficiency. The…
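To illustrate two points from the abstract, the sketch below shows a software model of tree inference that uses only comparisons (no multiplications) and counts how many nodes each tree visits for one sample. It is a minimal illustration, not the paper's NCU or dataflow: the `Node` layout, the `traverse` and `forest_predict` helpers, the example trees, and the majority vote (a simplification of how forest outputs are combined) are all assumptions made for this example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Node:
    """Hypothetical tree node: internal nodes hold a split, leaves hold a label."""
    feature: int = -1
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: Optional[int] = None  # set only on leaves


def traverse(root: Node, x: Sequence[float]) -> Tuple[int, int]:
    """Walk one tree; return (leaf label, number of internal nodes visited)."""
    node, visits = root, 0
    while node.label is None:
        visits += 1
        # The only arithmetic per node is a single comparison.
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label, visits


def forest_predict(trees: Sequence[Node], x: Sequence[float]) -> Tuple[int, List[int]]:
    """Majority vote over trees (a simplification), plus per-tree visit counts."""
    results = [traverse(tree, x) for tree in trees]
    labels = [label for label, _ in results]
    visits = [count for _, count in results]
    return Counter(labels).most_common(1)[0][0], visits


# Two hand-built example trees of different depth (illustrative values only).
shallow = Node(feature=0, threshold=0.5, left=Node(label=0), right=Node(label=1))
deep = Node(
    feature=0, threshold=0.5,
    left=Node(label=0),
    right=Node(
        feature=1, threshold=2.0,
        left=Node(label=1),
        right=Node(feature=2, threshold=1.0, left=Node(label=1), right=Node(label=0)),
    ),
)

sample = [0.9, 3.0, 0.2]
prediction, visits_per_tree = forest_predict([shallow, deep], sample)
print(prediction, visits_per_tree)
```

For this sample the shallow tree finishes after one comparison while the deeper tree needs three. Uneven per-tree work of this kind is one plausible reading of the "node computing imbalance" that the abstract says the adaptive dataflow is meant to alleviate.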
Taxonomy
Topics: Advanced Neural Network Applications · Neural Networks and Applications · Parallel Computing and Optimization Techniques
