FINN: A Framework for Fast, Scalable Binarized Neural Network Inference

Yaman Umuroglu; Nicholas J. Fraser; Giulio Gambardella; Michaela Blott; Philip Leong; Magnus Jahre; Kees Vissers

arXiv:1612.07119 · cs.CV · December 22, 2016

TL;DR

FINN is a flexible FPGA framework for mapping binarized neural networks to hardware. On a ZC706 embedded platform drawing less than 25 W, it reaches up to 12.3 million image classifications per second on MNIST.

Contribution

The paper introduces FINN, an FPGA-based framework for fast, scalable binarized neural network inference in which the compute resources of each layer are tailored to user-provided throughput requirements.

Findings

01 Achieves up to 12.3 million classifications per second on MNIST

02 Demonstrates 21,906 classifications per second on CIFAR-10 and SVHN

03 Claims the fastest classification rates reported on these benchmarks at the time of publication (December 2016)

Abstract

Research has shown that convolutional neural networks contain significant redundancy, and high classification accuracy can be obtained even when weights and activations are reduced from floating point to binary values. In this paper, we present FINN, a framework for building fast and flexible FPGA accelerators using a flexible heterogeneous streaming architecture. By utilizing a novel set of optimizations that enable efficient mapping of binarized neural networks to hardware, we implement fully connected, convolutional and pooling layers, with per-layer compute resources being tailored to user-provided throughput requirements. On a ZC706 embedded FPGA platform drawing less than 25 W total system power, we demonstrate up to 12.3 million image classifications per second with 0.31 µs latency on the MNIST dataset with 95.8% accuracy, and 21906 image classifications per second with 283…
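
To make the abstract's premise concrete, the sketch below shows why binary weights and activations are cheap to compute with: a dot product over values in {-1, +1} reduces to an XNOR followed by a population count. This is a generic illustration of binarized arithmetic, not code from FINN, and the helper names `pack_bits` and `binary_dot` are hypothetical. It assumes each +1 is stored as a set bit and each -1 as a cleared bit.

```python
def pack_bits(values):
    """Pack a sequence of +1/-1 values into an int; bit i is set when values[i] == +1."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two length-n {-1, +1} vectors stored as packed bits.

    XNOR marks the positions where the two vectors agree (product +1);
    every other position contributes -1, so the sum is 2 * matches - n.
    """
    mask = (1 << n) - 1                      # keep only the n valid bits
    matches = bin(~(a_bits ^ b_bits) & mask).count("1")  # XNOR, then popcount
    return 2 * matches - n


# Illustrative values only (assumed, not taken from the paper).
weights = [1, -1, -1, 1, 1, -1, 1, 1]
activations = [1, 1, -1, -1, 1, -1, -1, 1]

# Ordinary multiply-accumulate, used here as a reference result.
reference = sum(w * a for w, a in zip(weights, activations))

result = binary_dot(pack_bits(weights), pack_bits(activations), len(weights))
print(result, reference)
```

Both computations give the same value, but the packed version needs only bitwise operations and a bit count, which is the kind of logic that maps compactly onto FPGA fabric.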
