HyT-NAS: Hybrid Transformers Neural Architecture Search for Edge Devices

Lotfi Abdelkrim Mecharbat, Hadjer Benmeziane, Hamza Ouarnoughi, and Smail Niar

arXiv:2303.04440·cs.CV·March 29, 2023·1 cites

Open Access

TL;DR

HyT-NAS introduces a hardware-aware neural architecture search method that designs efficient hybrid transformer-convolution models optimized for resource-constrained edge devices, achieving high accuracy with fewer parameters.

Contribution

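HyT-NAS improves on state-of-the-art HW-NAS by enriching the search space with hybrid convolution-attention architectures and by enhancing both the search strategy and the performance predictors, enabling the discovery of effective hybrid architectures for vision tasks on tiny devices.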

Findings

01

Achieves a similar hypervolume with roughly 5x fewer training evaluations

02

Outperforms MobileNetV1 by 6.3% accuracy on Visual Wake Words

03

Uses 3.5x fewer parameters than baseline models

Abstract

Vision Transformers have enabled recent attention-based Deep Learning (DL) architectures to achieve remarkable results in Computer Vision (CV) tasks. However, due to the extensive computational resources required, these architectures are rarely implemented on resource-constrained platforms. Current research investigates hybrid handcrafted convolution-based and attention-based models for CV tasks such as image classification and object detection. In this paper, we propose HyT-NAS, an efficient Hardware-aware Neural Architecture Search (HW-NAS) method that includes hybrid architectures and targets vision tasks on tiny devices. HyT-NAS improves state-of-the-art HW-NAS by enriching the search space and enhancing the search strategy as well as the performance predictors. Our experiments show that HyT-NAS achieves a similar hypervolume with roughly 5x fewer training evaluations. Our resulting architecture…
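
The hypervolume mentioned in the abstract is a standard indicator for comparing multi-objective searches: it measures the region of objective space dominated by a set of candidate architectures, bounded by a reference point. The sketch below is a minimal two-objective illustration, not the paper's implementation. It assumes both objectives are minimized (for example, error rate and latency), and the function names, sample points, and reference point are all hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (objective_1, objective_2), both minimized


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by the first objective."""
    front: List[Point] = []
    best_second = float("inf")
    # After sorting by the first objective, a point is non-dominated
    # only if it strictly improves the best second objective seen so far.
    for p in sorted(set(points)):
        if p[1] < best_second:
            front.append(p)
            best_second = p[1]
    return front


def hypervolume_2d(points: List[Point], reference: Point) -> float:
    """Area dominated by `points` and bounded by `reference` (minimization)."""
    # Points that do not strictly improve on the reference contribute nothing.
    inside = [p for p in points if p[0] < reference[0] and p[1] < reference[1]]
    area = 0.0
    prev_second = reference[1]
    # Sweep the front left to right, adding one horizontal strip per point.
    for first, second in pareto_front(inside):
        area += (reference[0] - first) * (prev_second - second)
        prev_second = second
    return area


# Hypothetical candidates: (error, latency) in arbitrary units.
candidates = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (2.5, 2.5)]
print(hypervolume_2d(candidates, reference=(4.0, 4.0)))
```

Under this reading, two searches that reach a similar hypervolume have found Pareto fronts of comparable quality, so the number of training evaluations each needed becomes the meaningful point of comparison.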

Taxonomy

Topics: Advanced Neural Network Applications · CCD and CMOS Imaging Sensors · Visual Attention and Saliency Detection

Methods: Pointwise Convolution · Average Pooling · Batch Normalization · 1x1 Convolution · Global Average Pooling · Softmax · Dense Connections · Depthwise Convolution · Convolution