N2N Learning: Network to Network Compression via Policy Gradient Reinforcement Learning
Anubhav Ashok, Nicholas Rhinehart, Fares Beainy, Kris M. Kitani

TL;DR
This paper introduces a reinforcement learning-based method to automatically compress large neural networks into smaller, efficient models while maintaining performance, using a data-driven approach with policy gradients.
Contribution
It presents a novel reinforcement learning framework that learns to generate compressed neural network architectures from a larger teacher network.
Findings
Achieves more than 10x compression on ResNet-34 while maintaining accuracy comparable to the teacher network.
Uses policy gradients to optimize network architecture reduction policies.
Policies pre-trained on smaller teacher networks accelerate training on larger teacher networks.
Abstract
While bigger and deeper neural network architectures continue to advance the state-of-the-art for many computer vision tasks, real-world adoption of these networks is impeded by hardware and speed constraints. Conventional model compression methods attempt to address this problem by modifying the architecture manually or using pre-defined heuristics. Since the space of all reduced architectures is very large, modifying the architecture of a deep neural network in this way is a difficult task. In this paper, we tackle this issue by introducing a principled method for learning reduced network architectures in a data-driven way using reinforcement learning. Our approach takes a larger 'teacher' network as input and outputs a compressed 'student' network derived from the 'teacher' network. In the first stage of our method, a recurrent policy network aggressively removes layers from the…
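To make the layer-removal idea concrete, here is a minimal sketch of a policy-gradient (REINFORCE) loop that learns which teacher layers to keep. It is an illustration under stated assumptions, not the authors' implementation: the recurrent policy network is replaced by one independent keep/remove logit per layer, and `toy_reward` is a hypothetical stand-in for training and evaluating the student network. The names `toy_reward`, `train_removal_policy`, `layer_params`, and `layer_importance` are invented for this example.

```python
import math
import random


def toy_reward(keep_mask, layer_params, layer_importance):
    """Hypothetical reward: (fraction of parameters removed) * (proxy accuracy).

    ASSUMPTION: a real system would train and evaluate the student network here.
    """
    total_params = sum(layer_params)
    kept_params = sum(p for keep, p in zip(keep_mask, layer_params) if keep)
    compression = 1.0 - kept_params / total_params
    kept_importance = sum(w for keep, w in zip(keep_mask, layer_importance) if keep)
    accuracy = kept_importance / sum(layer_importance)
    return compression * accuracy


def train_removal_policy(layer_params, layer_importance, steps=2000, lr=0.1, seed=0):
    """Learn per-layer keep probabilities with REINFORCE and a moving-average baseline.

    ASSUMPTION: independent Bernoulli actions per layer, not a recurrent policy.
    """
    rng = random.Random(seed)
    logits = [0.0] * len(layer_params)  # one keep/remove logit per teacher layer
    baseline = 0.0
    for _ in range(steps):
        probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
        # Sample a candidate student: 1 keeps the layer, 0 removes it.
        mask = [1 if rng.random() < p else 0 for p in probs]
        reward = toy_reward(mask, layer_params, layer_importance)
        advantage = reward - baseline
        # REINFORCE update: d/dz log Bernoulli(a; sigmoid(z)) = a - sigmoid(z).
        # Logits are clamped so math.exp cannot overflow.
        logits = [
            max(-20.0, min(20.0, z + lr * advantage * (a - p)))
            for z, a, p in zip(logits, mask, probs)
        ]
        baseline = 0.9 * baseline + 0.1 * reward
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


if __name__ == "__main__":
    # Hypothetical teacher: two small but important layers, two large but less useful ones.
    params = [100, 100, 1000, 1000]
    importance = [5.0, 5.0, 1.0, 1.0]
    print(train_removal_policy(params, importance))
```

The product-form reward is zero both when nothing is removed and when everything is removed, so the policy has to trade compression against accuracy. The moving-average baseline is a common variance-reduction choice and is likewise an assumption of this sketch.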
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
