TL;DR
CoopNet combines fixed-point quantization and binarization within a single CNN targeted at low-power MCUs, achieving lower latency and higher accuracy than designs that apply the two methods separately.
Contribution
This work demonstrates the effectiveness of jointly optimizing quantization and binarization in CNNs for low-power micro-controllers, improving performance over separate optimizations.
Findings
Lower latency and higher accuracy than designs where quantization and binarization are applied separately.
Validated on three CNNs, using the low-power ARM Cortex-M cores as the test bench.
Implemented and tested on off-the-shelf MCUs with small on-chip memory and few computational resources.
Abstract
Fixed-point quantization and binarization are two reduction methods adopted to deploy Convolutional Neural Networks (CNNs) on end-nodes powered by low-power micro-controller units (MCUs). While most existing works use them as stand-alone optimizations, this work aims to demonstrate that there is margin for a joint cooperation that leads to inference engines with lower latency and higher accuracy. The proposed heterogeneous model, called CoopNet, is conceived, implemented and tested on off-the-shelf MCUs with small on-chip memory and few computational resources. Experiments conducted on three different CNNs, using the low-power RISC cores of the ARM Cortex-M family as a test bench, validate the CoopNet proposal by showing substantial improvements with respect to designs where quantization and binarization are applied separately.
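For readers unfamiliar with the two reduction methods named in the abstract, the following is a minimal sketch of what each does to a small list of weights. It is not CoopNet's implementation: the function names, the 8-bit word with 6 fractional bits, and the mean-absolute-value scaling factor used for binarization are all assumptions chosen for illustration.

```python
def quantize_fixed_point(weights, total_bits=8, frac_bits=6):
    """Round weights to signed fixed-point integers (hypothetical 8-bit, 6 fractional bits)."""
    scale = 1 << frac_bits                      # 2**frac_bits steps per unit
    q_min = -(1 << (total_bits - 1))            # most negative representable integer
    q_max = (1 << (total_bits - 1)) - 1         # most positive representable integer
    # Scale, round to the nearest integer, then saturate to the representable range.
    return [max(q_min, min(q_max, round(w * scale))) for w in weights]


def dequantize_fixed_point(q_weights, frac_bits=6):
    """Map fixed-point integers back to real values."""
    scale = 1 << frac_bits
    return [q / scale for q in q_weights]


def binarize(weights):
    """Keep only the sign of each weight, plus one shared scaling factor.

    The scaling factor (mean absolute value) is an assumption of this sketch.
    """
    alpha = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]
    return signs, alpha


weights = [0.5, -0.25, 0.75, -1.0]
q = quantize_fixed_point(weights)               # 8-bit integers
signs, alpha = binarize(weights)                # 1-bit signs and one scale
```

Quantization keeps several bits per weight and so preserves magnitudes up to rounding and saturation, whereas binarization stores a single bit per weight and relies on the shared scale. That difference in precision and cost is the trade-off a joint design can exploit.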
