A Computing Kernel for Network Binarization on PyTorch

Xianda Xu; Marco Pedersoli

arXiv:1911.04477·cs.LG·November 13, 2019

TL;DR

This paper introduces a computing kernel for network binarization in PyTorch that accelerates binarized-network inference by 3x on GPU and 4.5x on CPU, easing deployment of efficient neural networks on low-power devices.

Contribution

It develops the first PyTorch computing kernel supporting 1-bit xnor and bitcount operations for network binarization.

Findings

01 3x inference acceleration on GPU
02 4.5x inference acceleration on CPU
03 Enables efficient deployment of binarized neural networks

Abstract

Deep neural networks now achieve state-of-the-art results in a wide range of tasks, including image classification and object detection. However, they are both computationally expensive and memory intensive, which makes them difficult to deploy on low-power devices. Network binarization is an effective existing technique for model compression and acceleration, but PyTorch does not yet have a computing kernel that supports it. In this paper, we develop a computing kernel supporting 1-bit xnor and bitcount computation on PyTorch. Experimental results show that our kernel accelerates inference of a binarized neural network by 3 times on GPU and by 4.5 times on CPU compared with the control group.
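To make the "1-bit xnor and bitcount computation" concrete, the following is a minimal sketch in plain Python of the general technique, not the paper's kernel. It assumes weights and activations are already binarized to +1/-1 and packed one value per bit (+1 as 1, -1 as 0). The helper names `pack_signs` and `xnor_dot` are hypothetical and chosen only for this illustration.

```python
def pack_signs(values):
    """Pack a sequence of +1/-1 values into one integer, one bit per value.

    Assumed encoding for this sketch: +1 -> bit 1, -1 -> bit 0.
    """
    word = 0
    for i, v in enumerate(values):
        if v > 0:
            word |= 1 << i
    return word


def xnor_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n.

    XNOR marks the positions where the two signs agree; each agreement
    contributes +1 and each disagreement -1, so the dot product equals
    2 * (number of agreements) - n.
    """
    mask = (1 << n) - 1                 # keep only the n valid bits
    agree = ~(a_bits ^ b_bits) & mask   # xnor, restricted to n bits
    matches = bin(agree).count("1")     # bitcount (population count)
    return 2 * matches - n


a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, 1, -1]

# Ordinary multiply-accumulate dot product, for comparison.
reference = sum(x * y for x, y in zip(a, b))
result = xnor_dot(pack_signs(a), pack_signs(b), len(a))
print(result, reference)
```

Both values agree because the multiply-accumulate over +1/-1 entries reduces to counting sign agreements. A real kernel would apply the same identity to machine words using hardware xnor and population-count instructions rather than Python integers.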

Code & Models

Repositories

brycexu/BNN_Kernel
PyTorch · Official

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning