AMC: AutoML for Model Compression and Acceleration on Mobile Devices
Yihui He, Ji Lin, Zhijian Liu, Hanrui Wang, Li-Jia Li, and Song Han

TL;DR
This paper introduces AMC, an AutoML approach that uses reinforcement learning to automate model compression for mobile devices, achieving higher compression ratios and better accuracy preservation than hand-crafted, rule-based policies.
Contribution
The paper presents a novel reinforcement learning-based AutoML framework for model compression that outperforms hand-crafted policies in efficiency and accuracy.
Findings
Achieved 2.7% better accuracy than the hand-crafted compression policy under 4x FLOPs reduction for VGG-16 on ImageNet.
Realized 1.81x inference speedup on Android with minimal accuracy loss.
Automated pipeline reduces human effort in model compression tasks.
Abstract
Model compression is a critical technique to efficiently deploy neural network models on mobile devices, which have limited computation resources and tight power budgets. Conventional model compression techniques rely on hand-crafted heuristics and rule-based policies that require domain experts to explore the large design space trading off among model size, speed, and accuracy, which is usually sub-optimal and time-consuming. In this paper, we propose AutoML for Model Compression (AMC), which leverages reinforcement learning to provide the model compression policy. This learning-based compression policy outperforms the conventional rule-based compression policy by having a higher compression ratio, better preserving accuracy, and freeing human labor. Under 4x FLOPs reduction, we achieved 2.7% better accuracy than the handcrafted model compression policy for VGG-16 on ImageNet. We applied…
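To make the "4x FLOPs reduction" figure concrete, the following is a minimal sketch of how a compression policy translates into a FLOPs budget. It assumes the policy is expressed as one channel keep ratio per convolutional layer; the `ConvLayer` class, the helper functions, and the three-layer toy network are all hypothetical and are not taken from the paper or its code. The policy here is fixed by hand, whereas AMC searches for it with reinforcement learning.

```python
from dataclasses import dataclass


@dataclass
class ConvLayer:
    """Hypothetical description of one conv layer (square kernel and output)."""
    in_channels: int
    out_channels: int
    kernel_size: int
    out_size: int  # output feature map height (= width)

    def flops(self) -> int:
        # Multiply-accumulate count of a standard convolution.
        return (self.in_channels * self.out_channels
                * self.kernel_size ** 2 * self.out_size ** 2)


def apply_policy(layers, keep_ratios):
    """Prune output channels layer by layer.

    Removing output channels of layer i also shrinks the input
    channels of layer i + 1, so the savings compound.
    """
    pruned = []
    in_ch = layers[0].in_channels
    for layer, keep in zip(layers, keep_ratios):
        out_ch = max(1, round(layer.out_channels * keep))
        pruned.append(ConvLayer(in_ch, out_ch, layer.kernel_size, layer.out_size))
        in_ch = out_ch
    return pruned


def flops_reduction(layers, keep_ratios) -> float:
    """Ratio of original FLOPs to FLOPs after applying the policy."""
    before = sum(layer.flops() for layer in layers)
    after = sum(layer.flops() for layer in apply_policy(layers, keep_ratios))
    return before / after


# Toy network (assumption for illustration only, not VGG-16).
toy_net = [
    ConvLayer(3, 64, 3, 32),
    ConvLayer(64, 128, 3, 16),
    ConvLayer(128, 256, 3, 8),
]
uniform_policy = [0.5, 0.5, 0.5]   # a hand-crafted, rule-based policy
reduction = flops_reduction(toy_net, uniform_policy)
```

On this toy network, keeping half the channels in every layer gives a reduction of roughly 3.8x. A learned policy would instead choose a different ratio per layer to hit the same budget while losing less accuracy, which is the design space the abstract describes.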
Taxonomy
Topics: Machine Learning and Data Classification · Advanced Neural Network Applications · Machine Learning and Algorithms
