Efficient Few-Shot Learning for Edge AI via Knowledge Distillation on MobileViT
Shuhei Tsuyuki, Reda Bensaid, Jérémy Morlier, Mathieu Léonardon, Naoya Onizawa, Vincent Gripon, Takahiro Hanyu

TL;DR
This paper introduces a knowledge distillation approach for MobileViT to enhance few-shot learning on edge devices, achieving higher accuracy with significantly reduced model size and energy consumption.
Contribution
It presents a novel pre-training method using knowledge distillation for MobileViT, improving few-shot classification accuracy and efficiency on edge hardware.
Findings
Achieved accuracy improvements of 14% and 6.7% on one-shot and five-shot tasks, respectively.
Reduced model parameters by 69% and FLOPs by 88%.
Lowered power consumption by 37% on Jetson Orin Nano.
Abstract
Efficient and adaptable deep learning models are an important area of deep learning research, driven by the need for highly efficient models on edge devices. Few-shot learning enables the use of deep learning models in low-data regimes, a capability that is highly sought after in real-world applications where collecting large annotated datasets is costly or impractical. This challenge is particularly relevant in edge scenarios, where connectivity may be limited, low-latency responses are required, or energy consumption constraints are critical. We propose and evaluate a pre-training method for the MobileViT backbone designed for edge computing. Specifically, we employ knowledge distillation, which transfers the generalization ability of a large-scale teacher model to a lightweight student model. This method achieves accuracy improvements of 14% and 6.7% for one-shot and five-shot…
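The distillation step described in the abstract can be illustrated with a minimal sketch. The excerpt does not state the exact loss the authors use, so this assumes the common soft-target formulation: the student is trained to match the teacher's temperature-softened output distribution via a KL divergence. The function names and the temperature value are hypothetical choices for illustration, not taken from the paper.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis, with temperature scaling."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max to avoid overflow
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Soft-target distillation loss: KL(teacher || student) on softened outputs.

    Assumption: this is the generic soft-target formulation, not necessarily
    the loss used in the paper. The temperature default is illustrative.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    eps = 1e-12  # guards against log(0)
    kl = np.sum(
        p_teacher * (np.log(p_teacher + eps) - np.log(p_student + eps)),
        axis=-1,
    )
    # Scaling by T^2 keeps gradient magnitudes comparable across temperatures.
    return float(temperature ** 2 * kl.mean())

# Toy batch of one example with three classes.
teacher = np.array([[2.0, 0.5, -1.0]])
student = np.array([[-1.0, 0.5, 2.0]])
loss_mismatch = distillation_loss(student, teacher)
loss_match = distillation_loss(teacher, teacher)
```

In this toy example `loss_match` is zero because the two distributions are identical, while `loss_mismatch` is positive. In practice the logits would come from the large teacher network and the MobileViT student evaluated on the same input batch.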
