# Architecture Compression

**Authors:** Anubhav Ashok

arXiv: 1902.03326 · 2019-03-13

## TL;DR

This paper introduces Architecture Compression, a model compression method that encodes a network's architecture into a continuous embedding, optimizes that embedding by gradient descent for high accuracy and low parameter count, and decodes the result back into a discrete architecture. It is demonstrated on CIFAR-10, CIFAR-100, Fashion-MNIST, and SVHN.

## Contribution

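The paper presents a compression method that operates on the architecture space rather than on the weight or filter space. A 1-D CNN encoder-decoder learns a continuous embedding of discrete architectures, the embedding is jointly trained to regress accuracy and parameter count, and compression is carried out by gradient-based optimization in that embedding.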
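One way to picture the joint training of the embedding is as a weighted sum of a reconstruction term and two regression terms. The form below is only a sketch: the symbols $E$ (encoder), $D$ (decoder), $f_{\text{acc}}$ and $f_{\text{par}}$ (regression heads), the weights $\alpha$ and $\beta$, and the choice of squared error are all assumptions made for illustration, not notation or loss terms confirmed by this summary.

$$
\mathcal{L}(x) \;=\; \mathcal{L}_{\text{rec}}\bigl(D(E(x)),\, x\bigr) \;+\; \alpha\,\bigl(f_{\text{acc}}(E(x)) - a(x)\bigr)^{2} \;+\; \beta\,\bigl(f_{\text{par}}(E(x)) - p(x)\bigr)^{2}
$$

Here $x$ is a discrete architecture, $a(x)$ its measured accuracy on the dataset, and $p(x)$ its parameter count. The regression terms are what would tie positions in the embedding to how well and how cheaply an architecture performs.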

## Key findings

- Achieves over 20x compression on CIFAR-10.
- Also demonstrated on CIFAR-100, Fashion-MNIST, and SVHN.
- Operates on the architecture space, unlike classical compression methods, which operate on the weight or filter space.
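To make the "20x" figure concrete, the snippet below computes a compression ratio as original parameter count divided by compressed parameter count. That definition is an assumption, suggested by the parameter-count objective but not spelled out in this summary, and the two counts are hypothetical numbers chosen only to illustrate the arithmetic.

```python
def compression_ratio(original_params: int, compressed_params: int) -> float:
    """Return original / compressed parameter count (higher means a smaller model)."""
    if original_params <= 0 or compressed_params <= 0:
        raise ValueError("parameter counts must be positive")
    return original_params / compressed_params


# Hypothetical parameter counts, not taken from the paper.
teacher_params = 11_000_000
student_params = 500_000

ratio = compression_ratio(teacher_params, student_params)
print(f"{ratio:.1f}x compression")
```

Under this definition, a ratio above 20 means the compressed architecture has fewer than 5% of the original parameters.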

## Abstract

In this paper we propose a novel approach to model compression termed Architecture Compression. Instead of operating on the weight or filter space of the network like classical model compression methods, our approach operates on the architecture space. A 1-D CNN encoder-decoder is trained to learn a mapping from discrete architecture space to a continuous embedding and back. Additionally, this embedding is jointly trained to regress accuracy and parameter count in order to incorporate information about the architecture's effectiveness on the dataset. During the compression phase, we first encode the network and then perform gradient descent in continuous space to optimize a compression objective function that maximizes accuracy and minimizes parameter count. The final continuous feature is then mapped to a discrete architecture using the decoder. We demonstrate the merits of this approach on visual recognition tasks such as CIFAR-10, CIFAR-100, Fashion-MNIST and SVHN and achieve a greater than 20x compression on CIFAR-10.
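The compression phase described above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the paper's implementation: the encoder and decoder are omitted, `z0` stands in for an already-encoded network, the two predictors are fixed hand-written functions rather than learned regressors, and the objective `accuracy - lam * params` with trade-off weight `lam` is an assumed form.

```python
import math

# Hypothetical fixed weights standing in for the learned regression heads.
ACC_W = [0.8, -0.3, 0.5]    # toy accuracy predictor
PARAM_W = [1.2, 0.9, 0.4]   # toy parameter-count predictor


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predicted_accuracy(z):
    """Toy accuracy estimate in (0, 1)."""
    return sigmoid(dot(ACC_W, z))


def predicted_params(z):
    """Toy parameter-count estimate; softplus keeps it positive."""
    return math.log1p(math.exp(dot(PARAM_W, z)))


def objective(z, lam):
    """Assumed compression objective: reward accuracy, penalize size."""
    return predicted_accuracy(z) - lam * predicted_params(z)


def objective_grad(z, lam):
    """Analytic gradient of the objective with respect to the embedding z."""
    a = predicted_accuracy(z)
    s = sigmoid(dot(PARAM_W, z))  # derivative of softplus is sigmoid
    return [a * (1.0 - a) * wa - lam * s * wp for wa, wp in zip(ACC_W, PARAM_W)]


def compress_embedding(z0, lam=0.1, lr=0.05, steps=200):
    """Gradient ascent on the objective, starting from the encoded network z0."""
    z = list(z0)
    for _ in range(steps):
        g = objective_grad(z, lam)
        z = [zi + lr * gi for zi, gi in zip(z, g)]
    return z


z0 = [1.0, 1.0, 1.0]  # hypothetical embedding of the original network
z_star = compress_embedding(z0)
print("objective before:", objective(z0, 0.1))
print("objective after: ", objective(z_star, 0.1))
```

In the method as described, the optimized embedding would then be passed through the decoder to obtain a discrete architecture; that step has no counterpart in this toy sketch.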

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.03326/full.md

## Figures

35 figures with captions in the complete paper: https://tomesphere.com/paper/1902.03326/full.md

## References

28 references — full list in the complete paper: https://tomesphere.com/paper/1902.03326/full.md

---
Source: https://tomesphere.com/paper/1902.03326