HyT-NAS: Hybrid Transformers Neural Architecture Search for Edge Devices
Lotfi Abdelkrim Mecharbat, Hadjer Benmeziane, Hamza Ouarnoughi, and Smail Niar

TL;DR
HyT-NAS introduces a hardware-aware neural architecture search method that designs efficient hybrid transformer-convolution models optimized for resource-constrained edge devices, achieving high accuracy with fewer parameters.
Contribution
HyT-NAS improves on state-of-the-art HW-NAS by enriching the search space and enhancing both the search strategy and the performance predictors, enabling the discovery of effective hybrid convolution-attention architectures for vision tasks on tiny devices.
Findings
Achieves a similar hypervolume with roughly 5x fewer training evaluations
Outperforms MobileNetV1 by 6.3% accuracy on Visual Wake Words
Uses 3.5x fewer parameters than baseline models
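The hypervolume in the first finding measures the quality of a set of accuracy/cost trade-offs: it is the size of the objective-space region dominated by the Pareto front and bounded by a reference point. The sketch below illustrates that metric for two objectives that are both minimized (for example, error rate and parameter count). The function names, the reference point, and the sample values are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (objective_1, objective_2), both minimized


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by the first objective."""
    front: List[Point] = []
    best_second = float("inf")
    # After sorting, a point is non-dominated only if it improves the
    # second objective over every point seen so far.
    for p in sorted(set(points)):
        if p[1] < best_second:
            front.append(p)
            best_second = p[1]
    return front


def hypervolume_2d(points: List[Point], reference: Point) -> float:
    """Area dominated by the front and bounded by the reference point."""
    # Points that do not improve on the reference contribute nothing.
    front = [
        p for p in pareto_front(points)
        if p[0] < reference[0] and p[1] < reference[1]
    ]
    area = 0.0
    prev_second = reference[1]
    # Sweep the front left to right, adding one horizontal slab per point.
    for first, second in front:
        area += (reference[0] - first) * (prev_second - second)
        prev_second = second
    return area


if __name__ == "__main__":
    # Hypothetical (error, size) pairs in arbitrary units.
    candidates = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (3.0, 3.0)]
    print(pareto_front(candidates))
    print(hypervolume_2d(candidates, reference=(4.0, 4.0)))
```

A larger hypervolume means the front dominates more of the objective space, so two search runs that reach a similar hypervolume have found comparably good trade-off sets, regardless of how many architectures each had to train to get there.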
Abstract
Vision Transformers have enabled recent attention-based Deep Learning (DL) architectures to achieve remarkable results in Computer Vision (CV) tasks. However, due to the extensive computational resources required, these architectures are rarely implemented on resource-constrained platforms. Current research investigates hybrid handcrafted convolution-based and attention-based models for CV tasks such as image classification and object detection. In this paper, we propose HyT-NAS, an efficient Hardware-aware Neural Architecture Search (HW-NAS) including hybrid architectures targeting vision tasks on tiny devices. HyT-NAS improves state-of-the-art HW-NAS by enriching the search space and enhancing the search strategy as well as the performance predictors. Our experiments show that HyT-NAS achieves a similar hypervolume with less than ~5x training evaluations. Our resulting architecture…
Taxonomy
Topics: Advanced Neural Network Applications · CCD and CMOS Imaging Sensors · Visual Attention and Saliency Detection
Methods: Pointwise Convolution · Average Pooling · Batch Normalization · 1x1 Convolution · Global Average Pooling · Softmax · Dense Connections · Depthwise Convolution · Convolution
