Filter Distillation for Network Compression
Xavier Suau, Luca Zappella, Nicholas Apostoloff

TL;DR
This paper introduces Principal Filter Analysis (PFA), a practical method for neural network compression that leverages filter response correlations to reduce model size while maintaining accuracy across various architectures and datasets.
Contribution
The paper presents PFA, a novel, easy-to-use compression technique that adapts networks to specific domains and properties by analyzing filter response correlations.
Findings
Achieves up to 8x compression on VGG-16 with minimal accuracy loss.
Outperforms or matches state-of-the-art compression methods.
Demonstrates flexibility and practicality across multiple datasets and architectures.
Abstract
In this paper we introduce Principal Filter Analysis (PFA), an easy-to-use and effective method for neural network compression. PFA exploits the correlation between filter responses within network layers to recommend a smaller network that maintains as much as possible of the accuracy of the full model. We propose two algorithms. The first allows users to target compression to a specific network property, such as the number of trainable variables (footprint), and produces a compressed model that satisfies the requested property while preserving the maximum amount of spectral energy in the responses of each layer. The second is a parameter-free heuristic that selects the compression used at each layer by trying to mimic an ideal set of uncorrelated responses. Since PFA compresses networks based on the correlation of their responses, we show in our experiments that it gains the additional…
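The spectral-energy idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `filters_to_keep`, the `energy` parameter, and the assumption that each filter has already been reduced to one scalar response per input sample are all hypothetical choices made for illustration. The sketch only estimates how many filters a layer might need; it does not decide which filters to remove.

```python
import numpy as np


def filters_to_keep(responses, energy=0.95):
    """Estimate how many filters are needed to retain a fraction of spectral energy.

    responses: array of shape (n_samples, n_filters), one scalar response
               per filter and per input sample (an assumption of this sketch).
    energy:    fraction of the total eigenvalue mass to preserve (hypothetical name).
    """
    # Correlation matrix between filters (columns are the variables).
    corr = np.corrcoef(responses, rowvar=False)
    # The matrix is symmetric, so eigvalsh applies; it returns ascending order.
    eigvals = np.linalg.eigvalsh(corr)[::-1]
    # Clip tiny negative values caused by floating-point error.
    eigvals = np.clip(eigvals, 0.0, None)
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    # Smallest count whose cumulative energy reaches the target, capped at n_filters.
    count = int(np.searchsorted(cumulative, energy)) + 1
    return min(count, eigvals.size)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.normal(size=(500, 4))
    # Eight filters, of which four are exact copies of the other four.
    redundant = np.hstack([base, base])
    print(filters_to_keep(redundant, energy=0.99))
```

In the demo, half of the filters duplicate the other half, so the correlation matrix has only four non-zero eigenvalues and the estimate drops accordingly. A layer with largely uncorrelated responses would yield a count close to its full width.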
