Optimization of XNOR Convolution for Binary Convolutional Neural Networks on GPU
Mete Can Kaya, Alperen İnci, Alptekin Temizel

TL;DR
This paper presents an optimized GPU implementation of XNOR convolution for binary neural networks, significantly accelerating inference and enabling real-time deployment on embedded devices.
Contribution
It introduces a specialized GPU optimization for XNOR convolution in binary neural networks, improving inference speed and practicality for embedded systems.
Findings
The GPU implementation achieves a speed-up of up to 42.61×
The peak speed-up is reported with a 3×3 kernel
Binary networks trained on less constrained hardware can be deployed for real-time inference on embedded devices
Abstract
Binary convolutional networks have a lower computational load and a smaller memory footprint than their full-precision counterparts. They are therefore a feasible alternative for deploying computer vision applications on embedded devices with limited capacity. Once trained in less resource-constrained computational environments, they can be deployed for real-time inference on such devices. In this study, we propose an implementation of binary convolutional network inference on GPU, focusing on the optimization of XNOR convolution. Experimental results show that using a GPU can provide a speed-up of up to 42.61× with a kernel size of 3×3. The implementation is publicly available at https://github.com/metcan/Binary-Convolutional-Neural-Network-Inference-on-GPU
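The abstract names XNOR convolution without spelling out the operation. The sketch below shows the usual idea behind it in plain Python: when weights and activations are restricted to +1/-1 and packed into bits, a dot product reduces to an XNOR followed by a bit count. This is a CPU-only illustration under stated assumptions, not the authors' GPU kernel; the function names `pack_signs`, `xnor_dot`, and `xnor_conv2d` are hypothetical, and the bit layout (bit set means +1) is an arbitrary choice made here.

```python
def pack_signs(values):
    """Pack a flat list of +1/-1 values into an int; bit i is 1 when values[i] == +1."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def xnor_dot(a_bits, b_bits, n):
    """Dot product of two length-n +1/-1 vectors given in packed form.

    XNOR marks the positions where the signs agree. Each agreement
    contributes +1 and each disagreement -1, so the result is
    matches - (n - matches) = 2 * matches - n.
    """
    mask = (1 << n) - 1                      # keep only the n valid bits
    matches = bin(~(a_bits ^ b_bits) & mask).count("1")
    return 2 * matches - n


def xnor_conv2d(image, kernel):
    """Stride-1, no-padding sliding-window product of a +1/-1 image and kernel.

    As is common in CNNs, the kernel is not flipped (cross-correlation).
    """
    kh, kw = len(kernel), len(kernel[0])
    n = kh * kw
    k_bits = pack_signs([v for row in kernel for v in row])
    out = []
    for r in range(len(image) - kh + 1):
        out_row = []
        for c in range(len(image[0]) - kw + 1):
            patch = [image[r + i][c + j] for i in range(kh) for j in range(kw)]
            out_row.append(xnor_dot(pack_signs(patch), k_bits, n))
        out.append(out_row)
    return out
```

A 3×3 kernel packs into 9 bits, so each output pixel here costs one XOR, one NOT, one mask, and one bit count instead of nine multiply-adds. A GPU version would typically pack the input once rather than per patch and use a hardware population-count instruction; how the linked repository does this is not described in the text above.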
Taxonomy
Topics: Advanced Neural Network Applications · Machine Learning and ELM · Brain Tumor Detection and Classification
