Deep Model Compression Via Two-Stage Deep Reinforcement Learning

Huixin Zhan, Wei-Ming Lin, and Yongcan Cao

arXiv:1912.02254 · cs.LG · July 5, 2021

TL;DR

This paper introduces a two-stage deep reinforcement learning approach for CNN model compression, combining pruning and quantization to significantly reduce model size while maintaining or improving accuracy.

Contribution

It proposes a deep reinforcement learning (DRL) framework that compresses CNNs in a two-stage pipeline: pruning first, then quantization.

Findings

01. Achieved a 9x size reduction on CIFAR-10 with a slight accuracy gain.

02. Reduced VGG-16 size by 33x on ImageNet with no accuracy loss.

03. Demonstrated effectiveness on the CIFAR-10 and ImageNet datasets.

Abstract

Besides accuracy, the size of convolutional neural network (CNN) models is another important factor, given the limited hardware resources available in practical applications. For example, employing deep neural networks on mobile systems requires the design of accurate yet fast CNNs for low latency in classification and object detection. To meet this need, we aim at obtaining CNN models with both high testing accuracy and small size to address the resource constraints of many embedded devices. In particular, this paper proposes a generic reinforcement learning-based model compression approach in a two-stage compression pipeline: pruning and quantization. The first stage of compression, i.e., pruning, is achieved via exploiting deep reinforcement learning (DRL) to co-learn the accuracy and the FLOPs updated after layer-wise channel pruning and element-wise variational pruning via…
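To make the two-stage pipeline concrete, the following is a minimal sketch of pruning followed by quantization on a single convolutional weight tensor. It is not the authors' method: the pruning ratio and bit width, which the paper describes as learned with DRL, are fixed by hand here, and channels are ranked by L1 norm as a simple stand-in criterion. The function names `prune_channels`, `quantize_uniform`, and `size_bits`, the tensor shape, and the values `keep_ratio=0.5` and `bits=8` are all assumptions chosen for illustration.

```python
import numpy as np

def prune_channels(weight, keep_ratio):
    """Stage 1 (stand-in): keep the output channels with the largest L1 norm.

    weight has shape (out_channels, in_channels, kh, kw).
    keep_ratio is hand-picked here; the paper learns pruning decisions with DRL.
    """
    n_keep = max(1, int(round(weight.shape[0] * keep_ratio)))
    norms = np.abs(weight).reshape(weight.shape[0], -1).sum(axis=1)
    keep = np.sort(np.argsort(norms)[-n_keep:])  # indices of surviving channels
    return weight[keep], keep

def quantize_uniform(weight, bits):
    """Stage 2 (stand-in): symmetric uniform quantization to `bits` bits.

    Returns integer codes and the scale needed to dequantize (codes * scale).
    """
    qmax = 2 ** (bits - 1) - 1
    max_abs = float(np.abs(weight).max())
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = np.clip(np.round(weight / scale), -qmax, qmax).astype(np.int32)
    return codes, scale

def size_bits(shape, bits):
    """Storage cost of a dense tensor with the given shape and bit width."""
    return int(np.prod(shape)) * bits

# Hypothetical layer: 64 output channels, 32 input channels, 3x3 kernels.
rng = np.random.default_rng(0)
w = rng.normal(size=(64, 32, 3, 3)).astype(np.float32)

pruned, kept = prune_channels(w, keep_ratio=0.5)   # stage 1: pruning
codes, scale = quantize_uniform(pruned, bits=8)    # stage 2: quantization

# Compare 32-bit storage of the original with 8-bit storage of the pruned tensor.
ratio = size_bits(w.shape, 32) / size_bits(pruned.shape, 8)
print(f"kept {len(kept)} of {w.shape[0]} channels, compression ratio {ratio:.1f}x")
```

In this toy setting the two stages multiply: halving the channels gives 2x and moving from 32-bit to 8-bit weights gives 4x. The sketch ignores the fact that removing output channels also shrinks the next layer's input channels, and it does not retrain or measure accuracy, both of which matter in a real compression pipeline.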

Taxonomy

Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification

Methods: Pruning