Proxyless Neural Architecture Adaptation for Supervised Learning and Self-Supervised Learning
Do-Guk Kim, Heung-Chang Lee

TL;DR
This paper introduces a proxyless neural architecture adaptation method that is reproducible, efficient, and applicable to both supervised and self-supervised learning, and that outperforms Neural Architecture Transformer (NAT) on CIFAR-10 and Tiny ImageNet.
Contribution
The paper presents a novel proxyless neural architecture adaptation technique that improves reproducibility and efficiency over NAT, applicable to various learning paradigms and datasets.
Findings
Outperforms NAT on CIFAR-10 and Tiny ImageNet
Demonstrates stable performance across different architectures
Applicable to both supervised and self-supervised learning
Abstract
Recently, Neural Architecture Search (NAS) methods have been introduced and have shown impressive performance on many benchmarks. Among these NAS studies, Neural Architecture Transformer (NAT) aims to adapt a given neural architecture to improve performance while maintaining computational cost. However, NAT lacks reproducibility, and it requires an additional architecture adaptation process before network weight training. In this paper, we propose proxyless neural architecture adaptation, which is reproducible and efficient. Our method can be applied to both supervised learning and self-supervised learning, and it shows stable performance on various architectures. Extensive reproducibility experiments on two datasets, CIFAR-10 and Tiny ImageNet, show that the proposed method outperforms NAT and is applicable to other models and datasets.
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dropout · Adam · Byte Pair Encoding · Residual Connection · Label Smoothing · Position-Wise Feed-Forward Layer · Absolute Position Encodings
