PQK: Model Compression via Pruning, Quantization, and Knowledge Distillation

Jangho Kim; Simyung Chang; Nojun Kwak

arXiv:2106.14681 · cs.LG · June 29, 2021

TL;DR

This paper introduces PQK, a novel model compression technique combining pruning, quantization, and knowledge distillation that creates an efficient, lightweight DNN suitable for edge devices without requiring pre-trained teacher models.

Contribution

PQK uniquely integrates pruning, quantization, and in-network knowledge distillation to produce compact models without pre-training a teacher network.

Findings

01 Effective in keyword spotting tasks

02 Reduces model size and computational cost

03 Maintains high accuracy on image recognition

Abstract

As edge devices become prevalent, deploying Deep Neural Networks (DNN) on edge devices has become a critical issue. However, DNN requires a high computational resource which is rarely available for edge devices. To handle this, we propose a novel model compression method for the devices with limited computational resources, called PQK consisting of pruning, quantization, and knowledge distillation (KD) processes. Unlike traditional pruning and KD, PQK makes use of unimportant weights pruned in the pruning process to make a teacher network for training a better student network without pre-training the teacher model. PQK has two phases. Phase 1 exploits iterative pruning and quantization-aware training to make a lightweight and power-efficient model. In phase 2, we make a teacher network by adding unimportant weights unused in phase 1 to a pruned network. By using this teacher network, we…
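The two phases described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the magnitude-based importance criterion, the symmetric 8-bit quantizer, the temperature-softened KL distillation loss, and every function name below are assumptions made for illustration. The sketch only shows the structural idea that the student uses the kept (important) weights while the teacher reuses the same weights plus the pruned (unimportant) ones, so no separately pre-trained teacher is needed.

```python
import numpy as np

def magnitude_mask(w, sparsity):
    """Binary mask: 1 = kept, 0 = pruned. Assumes |w| is the importance score."""
    k = int(round(sparsity * w.size))  # number of weights to prune
    if k == 0:
        return np.ones_like(w)
    threshold = np.sort(np.abs(w), axis=None)[k - 1]
    # Weights tied with the threshold are pruned as well.
    return (np.abs(w) > threshold).astype(w.dtype)

def fake_quantize(w, bits=8):
    """Symmetric uniform quantize-dequantize (an assumed quantizer)."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = np.max(np.abs(w))
    if max_abs == 0:
        return w.copy()
    scale = max_abs / qmax
    return np.clip(np.round(w / scale), -qmax, qmax) * scale

def softmax(z, t=1.0):
    z = z / t
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, t=2.0):
    """KL(teacher || student) on temperature-softened outputs (assumed loss)."""
    p_t = softmax(teacher_logits, t)
    p_s = softmax(student_logits, t)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1)
    return float(np.mean(kl) * t * t)

rng = np.random.default_rng(0)
w = rng.normal(size=(16, 4))   # hypothetical single linear layer
x = rng.normal(size=(8, 16))   # hypothetical input batch

# Phase 1 (sketch): prune, then quantize the surviving weights.
mask = magnitude_mask(w, sparsity=0.5)
student_w = fake_quantize(w * mask, bits=8)

# Phase 2 (sketch): the teacher adds the pruned weights back to the pruned network.
teacher_w = w

student_logits = x @ student_w
teacher_logits = x @ teacher_w
loss = kd_loss(student_logits, teacher_logits)
```

In a real training loop the mask would be updated iteratively and the loss backpropagated; here the forward pass alone is shown so the relationship between the two weight sets stays visible.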

Taxonomy

Methods: Pruning · Knowledge Distillation