Face Recognition with Hybrid Efficient Convolution Algorithms on FPGAs
Chuanhao Zhuge, Xinheng Liu, Xiaofan Zhang, Sudeep Gummadi, Jinjun Xiong, Deming Chen

TL;DR
This paper presents a hybrid convolution algorithm approach optimized for FPGA deployment, significantly accelerating face recognition tasks with improved latency and energy efficiency.
Contribution
It introduces a combined application of Winograd and FFT algorithms on FPGA, along with an optimization scheme for complex CNN architectures like Inception modules.
Findings
Achieves a 3.75x latency speedup over a high-end GPU
Surpasses previous FPGA face recognition results
Demonstrates effective FPGA acceleration for CNNs
Abstract
Deep Convolutional Neural Networks have become a Swiss Army knife in solving critical artificial intelligence tasks. However, deploying deep CNN models for latency-critical tasks remains challenging because of the complex nature of CNNs. Recently, the FPGA has become a favorable device for accelerating deep CNNs thanks to its highly parallel processing capability and energy efficiency. In this work, we explore different fast convolution algorithms, including Winograd and Fast Fourier Transform (FFT), and find an optimal strategy to apply them together on different types of convolutions. We also propose an optimization scheme to exploit parallelism on novel CNN architectures such as Inception modules in GoogLeNet. We implement a configurable IP-based face recognition acceleration system based on FaceNet using High-Level Synthesis. Our implementation on a Xilinx UltraScale device achieves 3.75x…
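To make the two fast convolution algorithms named in the abstract concrete, the following is a minimal 1D sketch in NumPy. It is not the authors' HLS implementation. The function names (`direct_conv1d`, `winograd_f23`, `fft_conv1d`) are hypothetical, and the tile size F(2,3) is chosen here only for illustration. It shows that both approaches reproduce the result of a direct sliding-window convolution while restructuring the arithmetic.

```python
import numpy as np


def direct_conv1d(d, g):
    """Direct 'valid' sliding-window correlation, as CNN layers compute it."""
    n_out = len(d) - len(g) + 1
    return np.array([np.dot(d[i:i + len(g)], g) for i in range(n_out)])


def winograd_f23(d, g):
    """Winograd F(2,3): two outputs from a 4-sample tile and a 3-tap filter.

    Uses 4 multiplications between data and filter terms instead of the
    6 needed by the direct method. The filter-side sums can be precomputed.
    """
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    m1 = (d0 - d2) * g0
    m2 = (d1 + d2) * (g0 + g1 + g2) / 2
    m3 = (d2 - d1) * (g0 - g1 + g2) / 2
    m4 = (d1 - d3) * g2
    return np.array([m1 + m2 + m3, m2 - m3 - m4])


def fft_conv1d(d, g):
    """FFT-based 'valid' correlation via pointwise products in frequency space."""
    n = len(d) + len(g) - 1  # length of the full linear convolution
    # Correlation equals convolution with the reversed filter.
    full = np.fft.irfft(np.fft.rfft(d, n) * np.fft.rfft(g[::-1], n), n)
    return full[len(g) - 1:len(d)]  # keep only the fully overlapping part


# Illustrative inputs (assumed values, not taken from the paper).
rng = np.random.default_rng(0)
tile = rng.standard_normal(4)
taps = rng.standard_normal(3)

reference = direct_conv1d(tile, taps)
assert np.allclose(winograd_f23(tile, taps), reference)
assert np.allclose(fft_conv1d(tile, taps), reference)
```

As a general rule of thumb, Winograd tends to suit small filters such as 3x3, while FFT-based convolution tends to pay off as the filter grows; which algorithm the paper assigns to which layer type is not stated in the truncated abstract above.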
Taxonomy
Topics: Face recognition and analysis · Video Surveillance and Tracking Methods · Advanced Image and Video Retrieval Techniques
Methods: 1x1 Convolution · Average Pooling · Local Response Normalization · Auxiliary Classifier · Inception Module · Dropout · Dense Connections · Max Pooling · Softmax
