CARLA: A Convolution Accelerator with a Reconfigurable and Low-Energy Architecture

Mehdi Ahmadi; Shervin Vakili; J.M. Pierre Langlois

arXiv:2010.00627 · cs.AR · October 5, 2020

TL;DR

This paper introduces a reconfigurable, low-energy CNN accelerator architecture that efficiently handles diverse convolutional layer structures, minimizes data movement, and maximizes resource utilization, demonstrated on VGGNet-16 and ResNet-50.

Contribution

It proposes a novel energy-efficient CNN accelerator architecture with optimized dataflows for structural diversity, achieving high PE utilization and reduced latency.

Findings

01

Achieves 98% PE utilization on most convolutional layers.

02

Limits latency to under 400 ms for VGGNet-16 and ResNet-50.

03

Reduces latency to 42.5 ms by exploiting sparsity in ResNet-50.
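The PE utilization figure above can be read as the share of processing-element cycles spent on useful multiply-accumulate (MAC) operations. The sketch below illustrates that ratio under stated assumptions: one MAC per PE per cycle, a dense layer, and a hypothetical array size, layer shape, and cycle overhead. None of these numbers or function names come from the paper, and the paper may define its utilization factor differently.

```python
import math


def conv_macs(out_h, out_w, out_ch, in_ch, kernel):
    """MAC count of one dense convolutional layer (no sparsity, no bias)."""
    return out_h * out_w * out_ch * in_ch * kernel * kernel


def pe_utilization(useful_macs, num_pes, cycles):
    """Fraction of PE-cycles doing useful work, assuming 1 MAC per PE per cycle."""
    return useful_macs / (num_pes * cycles)


# Hypothetical layer shape and PE count, chosen only for illustration.
macs = conv_macs(out_h=56, out_w=56, out_ch=64, in_ch=64, kernel=3)
num_pes = 64

# Lower bound on cycles if every PE were busy on every cycle.
ideal_cycles = math.ceil(macs / num_pes)

# Assumed 2% extra cycles for pipeline fill, drain, and data loading.
actual_cycles = math.ceil(ideal_cycles * 1.02)

utilization = pe_utilization(macs, num_pes, actual_cycles)
print(f"MACs: {macs}, cycles: {actual_cycles}, utilization: {utilization:.3f}")
```

Under this definition, utilization drops whenever a layer's dimensions do not map evenly onto the PE array or the array waits on data, which is the kind of loss the optimized dataflows are meant to limit.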

Abstract

Convolutional Neural Networks (CNNs) have proven to be extremely accurate for image recognition, even outperforming human recognition capability. When deployed on battery-powered mobile devices, efficient computer architectures are required to enable fast and energy-efficient computation of costly convolution operations. Despite recent advances in hardware accelerator design for CNNs, two major problems have not yet been addressed effectively, particularly when the convolution layers have highly diverse structures: (1) minimizing energy-hungry off-chip DRAM data movements; (2) maximizing the utilization factor of processing resources to perform convolutions. This work thus proposes an energy-efficient architecture equipped with several optimized dataflows to support the structural diversity of modern CNNs. The proposed approach is evaluated by implementing convolutional layers of…
