Meta-prediction Model for Distillation-Aware NAS on Unseen Datasets

Hayeon Lee; Sohyun An; Minseon Kim; Sung Ju Hwang

arXiv:2305.16948 · cs.LG · May 29, 2023 · 2 cites

TL;DR

This paper introduces DaSS, a meta-prediction model that efficiently predicts the performance of neural architectures during knowledge distillation on unseen datasets, reducing the need for costly training in NAS.

Contribution

The paper presents a novel distillation-aware meta accuracy prediction model that generalizes to unseen datasets, improving efficiency and performance in NAS with knowledge distillation.

Findings

1. DaSS outperforms existing meta-NAS methods on unseen datasets.

2. The model accurately predicts the performance of candidate architectures without training them.

3. DaSS significantly reduces the computational cost of DaNAS tasks.

Abstract

Distillation-aware Neural Architecture Search (DaNAS) aims to search for an optimal student architecture that obtains the best performance and/or efficiency when distilling the knowledge from a given teacher model. Previous DaNAS methods have mostly tackled the search for a fixed dataset and teacher, so they do not generalize well to a new task consisting of an unseen dataset and an unseen teacher, and thus need to perform a costly search for every new combination of dataset and teacher. For standard NAS tasks without knowledge distillation (KD), computationally efficient meta-learning-based NAS methods have been proposed, which learn a generalized search process over multiple tasks (datasets) and transfer the knowledge obtained from those tasks to a new task. However, since they assume learning from scratch without KD from a teacher, they might not be ideal for DaNAS…

Code & Models

Repositories

cownowan/dass · PyTorch · Official

Videos

Meta-prediction Model for Distillation-Aware NAS on Unseen Datasets · SlidesLive

Taxonomy

Topics: Advanced Neural Network Applications · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning