UCNN: Exploiting Computational Reuse in Deep Neural Networks via Weight Repetition
Kartik Hegde, Jiyong Yu, Rohit Agrawal, Mengjia Yan, Michael Pellauer, Christopher W. Fletcher

TL;DR
This paper introduces UCNN, a CNN accelerator that exploits weight repetition to improve energy efficiency and performance, outperforming sparsity-based methods at a 17-24% area overhead.
Contribution
The paper proposes UCNN, a novel CNN accelerator that exploits weight repetition for computation reuse and model size reduction, extending beyond traditional sparsity-based optimizations.
Findings
UCNN improves energy efficiency by 1.2x to 4x on three CNNs.
UCNN achieves these gains with only 17-24% area overhead.
The approach generalizes sparsity optimization to weight repetition exploitation.
Abstract
Convolutional Neural Networks (CNNs) have begun to permeate all corners of electronic society (from voice recognition to scene generation) due to their high accuracy and machine efficiency per operation. At their core, CNN computations are made up of multi-dimensional dot products between weight and input vectors. This paper studies how weight repetition, when the same weight occurs multiple times in or across weight vectors, can be exploited to save energy and improve performance during CNN inference. This generalizes a popular line of work to improve efficiency from CNN weight sparsity, as reducing computation due to repeated zero weights is a special case of reducing computation due to repeated weights. To exploit weight repetition, this paper proposes a new CNN accelerator called the Unique Weight CNN Accelerator (UCNN). UCNN uses weight repetition to reuse CNN…
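The idea of reusing computation across repeated weights can be illustrated with a single dot product. The sketch below is not UCNN's hardware design; it is a minimal software illustration that assumes integer weights and inputs, and the function names `dense_dot` and `repetition_dot` are hypothetical. Inputs that share a weight value are summed first, so each unique non-zero weight costs one multiply, and zero weights are skipped entirely, which is the sparsity special case the abstract mentions.

```python
from collections import defaultdict


def dense_dot(weights, inputs):
    """Baseline dot product: one multiply per weight/input pair."""
    return sum(w * x for w, x in zip(weights, inputs))


def repetition_dot(weights, inputs):
    """Dot product that groups inputs by weight value (illustrative only).

    Returns the result and the number of multiplies performed.
    """
    group_sums = defaultdict(int)
    for w, x in zip(weights, inputs):
        if w != 0:  # zero is just one more repeated weight, and it contributes nothing
            group_sums[w] += x  # add inputs that share the same weight
    result = sum(w * s for w, s in group_sums.items())  # one multiply per unique weight
    return result, len(group_sums)


weights = [2, 0, -1, 2, 0, -1, 2, 3]
inputs = [5, 7, 1, 4, 9, 6, 3, 8]

result, multiplies = repetition_dot(weights, inputs)
print(result, multiplies)  # 41 3
```

Here the baseline performs eight multiplies, while the grouped version performs three, one for each of the unique non-zero weights 2, -1, and 3. The saving depends on how few unique values the weight vector contains.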
