An Approximation Algorithm for Optimal Subarchitecture Extraction

Adrian de Wynter

arXiv:2010.08512 · cs.LG · October 19, 2020 · 1 cites

TL;DR

This paper introduces an approximation algorithm for selecting optimal neural network architectures based on size, speed, and accuracy, providing near-optimal solutions efficiently for many instances.

Contribution

It presents a novel approximation algorithm that behaves like an FPTAS for a broad class of instances in neural architecture optimization.

Findings

01

The algorithm achieves an approximation error of $ρ \leq ∣ 1 - ϵ ∣$ on a large subset of instances.

02

Runs in $O (∣ Ξ ∣ + ∣ W_{T}^{*} ∣ (1 + ∣ Θ ∣∣ B ∣∣ Ξ ∣/ (ϵ s^{3/2})))$ steps, where $ϵ$ and $s$ are input parameters.

03

Provides a formal framework for optimal subarchitecture extraction.

Abstract

We consider the problem of finding the set of architectural parameters for a chosen deep neural network which is optimal under three metrics: parameter size, inference speed, and error rate. In this paper we state the problem formally, and present an approximation algorithm that, for a large subset of instances, behaves like an FPTAS with an approximation error of $ρ \leq ∣ 1 - ϵ ∣$, and that runs in $O (∣ Ξ ∣ + ∣ W_{T}^{*} ∣ (1 + ∣ Θ ∣∣ B ∣∣ Ξ ∣/ (ϵ s^{3/2})))$ steps, where $ϵ$ and $s$ are input parameters; $∣ B ∣$ is the batch size; $∣ W_{T}^{*} ∣$ denotes the cardinality of the largest weight set assignment; and $∣ Ξ ∣$ and $∣ Θ ∣$ are the cardinalities of the candidate architecture and hyperparameter spaces, respectively.
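
To get a feel for how the step count quoted in the abstract scales, the expression inside the $O(\cdot)$ can be evaluated directly. The sketch below is only an illustration, not the paper's algorithm: the function name `step_bound` and its argument names are hypothetical, and because big-O notation hides constant factors, the returned numbers are meaningful only relative to one another.

```python
def step_bound(xi: int, w_max: int, theta: int, batch: int,
               eps: float, s: float) -> float:
    """Evaluate the expression inside the O(...) bound from the abstract.

    Hypothetical argument names (not from the paper):
      xi     -- |Xi|, number of candidate architectures
      w_max  -- |W_T^*|, cardinality of the largest weight set assignment
      theta  -- |Theta|, size of the hyperparameter space
      batch  -- |B|, batch size
      eps, s -- the two input parameters named in the abstract
    Constant factors hidden by the O(...) are ignored.
    """
    if eps <= 0 or s <= 0:
        raise ValueError("eps and s must be positive")
    # |Xi| + |W_T^*| * (1 + |Theta| |B| |Xi| / (eps * s^(3/2)))
    return xi + w_max * (1 + theta * batch * xi / (eps * s ** 1.5))


if __name__ == "__main__":
    # Made-up sizes, chosen only to show the dependence on eps.
    base = step_bound(xi=10, w_max=100, theta=5, batch=32, eps=0.5, s=4)
    tighter = step_bound(xi=10, w_max=100, theta=5, batch=32, eps=0.25, s=4)
    print(base, tighter)
```

With these made-up sizes, halving $ϵ$ roughly doubles the value, since the $1/(ϵ s^{3/2})$ term dominates; increasing $s$ reduces it.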

Taxonomy

Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning