Information-Theoretic Understanding of Population Risk Improvement with Model Compression

Yuheng Bu; Weihao Gao; Shaofeng Zou; Venugopal V. Veeravalli

arXiv:1901.09421·stat.ML·January 29, 2019·1 cites

TL;DR

This paper demonstrates that model compression can improve (that is, lower) population risk when the resulting decrease in generalization error outweighs the increase in empirical risk, supported by theoretical analysis and neural network experiments.

Contribution

It provides an information-theoretic framework showing how model compression acts as regularization to improve population risk, with practical insights for neural network compression.

Findings

01. Model compression reduces an information-theoretic bound on generalization error.

02. Population risk can be improved if the decrease in generalization error outweighs the increase in empirical risk.

03. Regularizing the cluster centers improves Hessian-weighted K-means compression.
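
To make the third finding concrete, here is a minimal sketch of Hessian-weighted K-means on scalar weights with an added penalty on the cluster centers. This page does not state the exact regularizer, so the pairwise squared-distance penalty, the function name `hessian_weighted_kmeans`, and the parameter `beta` are assumptions made for illustration, not the authors' implementation. The sketch minimizes sum_i h_i (w_i - c_{k(i)})^2 + beta * sum_{k<l} (c_k - c_l)^2 by alternating nearest-center assignment with a closed-form per-center update.

```python
import numpy as np

def hessian_weighted_kmeans(w, h, k, beta=0.0, iters=50, seed=0):
    """Quantize scalar weights w into k centers.

    h    : per-weight importance (e.g. a diagonal Hessian estimate), same shape as w.
    beta : hypothetical penalty strength on squared distances between centers
           (an assumption; beta=0 gives plain Hessian-weighted K-means).
    """
    w = np.asarray(w, dtype=float)
    h = np.asarray(h, dtype=float)
    rng = np.random.default_rng(seed)
    centers = rng.choice(w, size=k, replace=False)

    for _ in range(iters):
        # Assignment step: h_i scales the cost of weight i but does not
        # change which center is nearest, and the penalty ignores assignments.
        assign = np.argmin((w[:, None] - centers[None, :]) ** 2, axis=1)
        # Update step: minimize over one center at a time, others held fixed.
        for j in range(k):
            mask = assign == j
            num = np.sum(h[mask] * w[mask]) + beta * (centers.sum() - centers[j])
            den = np.sum(h[mask]) + beta * (k - 1)
            if den > 0:  # leave an empty, unpenalized cluster where it is
                centers[j] = num / den

    assign = np.argmin((w[:, None] - centers[None, :]) ** 2, axis=1)
    return centers, assign
```

Under this assumed penalty, a larger `beta` pulls the centers toward each other, which trades a higher quantization distortion for a codebook that is less spread out. Setting `beta=0` recovers the unregularized update, where each center is the Hessian-weighted mean of its assigned weights.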

Abstract

We show that model compression can improve the population risk of a pre-trained model, by studying the tradeoff between the decrease in the generalization error and the increase in the empirical risk with model compression. We first prove that model compression reduces an information-theoretic bound on the generalization error; this allows for an interpretation of model compression as a regularization technique to avoid overfitting. We then characterize the increase in empirical risk with model compression using rate distortion theory. These results imply that the population risk could be improved by model compression if the decrease in generalization error exceeds the increase in empirical risk. We show through a linear regression example that such a decrease in population risk due to model compression is indeed possible. Our theoretical results further suggest that the…
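
The tradeoff described in the abstract can be written out as follows. The notation is assumed here, since this page defines no symbols: $S$ is a training set of $n$ i.i.d. samples from $\mu$, $W$ the pre-trained weights, $\hat W$ the compressed weights, $L_\mu$ the population risk, and $L_S$ the empirical risk. The second line is one standard mutual-information bound (due to Xu and Raginsky), which holds when the loss is $\sigma$-sub-Gaussian; whether the paper uses exactly this form is not stated on this page.

```latex
\begin{align}
\mathbb{E}\big[L_\mu(\hat W)\big]
  &= \mathbb{E}\big[L_S(\hat W)\big] + \mathrm{gen}(\mu, P_{\hat W \mid S}), \\
\big|\mathrm{gen}(\mu, P_{\hat W \mid S})\big|
  &\le \sqrt{\frac{2\sigma^2}{n}\, I(S; \hat W)}, \\
I(S; \hat W) &\le I(S; W)
  \quad \text{because } S \to W \to \hat W \text{ forms a Markov chain.}
\end{align}
```

The last line is the data processing inequality: compression acts only on $W$, so it cannot increase the information the weights carry about $S$, and the bound on the generalization error can only shrink. Population risk improves when that reduction exceeds the growth in $\mathbb{E}[L_S(\hat W)]$ caused by the compression distortion.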

Code & Models

Repositories

wgao9/weight_quant (PyTorch)

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Model Reduction and Neural Networks · Generative Adversarial Networks and Image Synthesis