AMC: AutoML for Model Compression and Acceleration on Mobile Devices

Yihui He, Ji Lin, Zhijian Liu, Hanrui Wang, Li-Jia Li, and Song Han

arXiv:1802.03494·cs.CV·April 5, 2024·366 cites

TL;DR

This paper introduces AMC, an AutoML approach that uses reinforcement learning to automate model compression for mobile devices, achieving higher compression ratios and better accuracy preservation than hand-crafted, rule-based policies.

Contribution

The paper presents a novel reinforcement learning-based AutoML framework for model compression that outperforms hand-crafted policies in efficiency and accuracy.

Findings

01 Achieved 2.7% better accuracy than the hand-crafted compression policy for VGG-16 on ImageNet under a 4x FLOPs reduction.

02 Realized 1.81x inference speedup on Android with minimal accuracy loss.

03 The automated pipeline reduces human effort in model compression tasks.

Abstract

Model compression is a critical technique for efficiently deploying neural network models on mobile devices, which have limited computation resources and tight power budgets. Conventional model compression techniques rely on hand-crafted heuristics and rule-based policies that require domain experts to explore a large design space, trading off among model size, speed, and accuracy; this process is usually sub-optimal and time-consuming. In this paper, we propose AutoML for Model Compression (AMC), which leverages reinforcement learning to provide the model compression policy. This learning-based compression policy outperforms conventional rule-based compression policies by achieving a higher compression ratio, better preserving accuracy, and freeing human labor. Under 4x FLOPs reduction, we achieved 2.7% better accuracy than the hand-crafted model compression policy for VGG-16 on ImageNet. We applied…
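To make the "4x FLOPs reduction" figure concrete, here is a minimal sketch of how per-layer compression choices translate into an overall FLOPs reduction. It assumes channel pruning of plain convolution layers, which this listing does not state; the `ConvLayer` type, the function names, and the toy two-layer network are hypothetical and chosen only for illustration. A learned policy of the kind the abstract describes would be choosing values like `keep_ratios`, where a hand-crafted policy would fix them by rule.

```python
from dataclasses import dataclass


@dataclass
class ConvLayer:
    """Shape of one convolution layer (hypothetical helper type)."""
    c_in: int    # input channels
    c_out: int   # output channels
    kernel: int  # square kernel size
    out_h: int   # output feature-map height
    out_w: int   # output feature-map width


def conv_macs(layer: ConvLayer) -> int:
    """Multiply-accumulate count of a standard convolution."""
    return layer.c_in * layer.c_out * layer.kernel ** 2 * layer.out_h * layer.out_w


def apply_keep_ratios(layers, keep_ratios):
    """Prune output channels per layer (assumed scheme: channel pruning).

    Pruning a layer's outputs also shrinks the next layer's inputs,
    so savings compound across consecutive layers.
    """
    pruned = []
    prev_out = layers[0].c_in  # the network input is never pruned
    for layer, keep in zip(layers, keep_ratios):
        new_out = max(1, round(layer.c_out * keep))
        pruned.append(ConvLayer(prev_out, new_out, layer.kernel,
                                layer.out_h, layer.out_w))
        prev_out = new_out
    return pruned


def flops_reduction(layers, keep_ratios) -> float:
    """Ratio of original to pruned cost; 4.0 would mean a 4x reduction."""
    original = sum(conv_macs(l) for l in layers)
    pruned = sum(conv_macs(l) for l in apply_keep_ratios(layers, keep_ratios))
    return original / pruned


# Toy network (not VGG-16): two 3x3 convolutions, half the channels kept in each.
layers = [ConvLayer(3, 64, 3, 32, 32), ConvLayer(64, 128, 3, 16, 16)]
reduction = flops_reduction(layers, [0.5, 0.5])
print(f"FLOPs reduction: {reduction:.2f}x")
```

Because the second layer loses both input and output channels, its cost drops by roughly a factor of four while the first layer's only halves. This uneven effect across layers is one reason the per-layer design space is large enough to be worth searching automatically.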

Taxonomy

Topics: Machine Learning and Data Classification · Advanced Neural Network Applications · Machine Learning and Algorithms