Efficient Few-Shot Learning for Edge AI via Knowledge Distillation on MobileViT

Shuhei Tsuyuki; Reda Bensaid; Jérémy Morlier; Mathieu Léonardon; Naoya Onizawa; Vincent Gripon; Takahiro Hanyu

arXiv:2603.26145 · cs.CV · March 30, 2026

TL;DR

This paper introduces a knowledge distillation approach for MobileViT to enhance few-shot learning on edge devices, achieving higher accuracy with significantly reduced model size and energy consumption.

Contribution

It presents a novel pre-training method using knowledge distillation for MobileViT, improving few-shot classification accuracy and efficiency on edge hardware.

Findings

01

Achieved accuracy improvements of 14% and 6.7% on one-shot and five-shot tasks, respectively.

02

Reduced model parameters by 69% and FLOPs by 88%.

03

Lowered power consumption by 37% on Jetson Orin Nano.

Abstract

Efficient and adaptable deep learning models are an important area of deep learning research, driven by the need for highly efficient models on edge devices. Few-shot learning enables the use of deep learning models in low-data regimes, a capability that is highly sought after in real-world applications where collecting large annotated datasets is costly or impractical. This challenge is particularly relevant in edge scenarios, where connectivity may be limited, low-latency responses are required, or energy consumption constraints are critical. We propose and evaluate a pre-training method for the MobileViT backbone designed for edge computing. Specifically, we employ knowledge distillation, which transfers the generalization ability of a large-scale teacher model to a lightweight student model. This method achieves accuracy improvements of 14% and 6.7% for one-shot and five-shot…
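
The abstract describes knowledge distillation only at a high level, so the following is a minimal sketch of one common formulation rather than the paper's actual training objective. It assumes the classic soft-target loss: the KL divergence between temperature-softened teacher and student output distributions, scaled by the squared temperature. The function names, the temperature value of 4.0, and the example logits are all hypothetical choices made for illustration.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_sum_exp = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_sum_exp for s in scaled]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Assumption: this is the generic soft-target loss, not necessarily
    the objective used in the paper.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher (target)
    log_q = log_softmax(student_logits, temperature)  # student (prediction)
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return temperature ** 2 * kl


# Hypothetical logits for a single image over three classes.
teacher = [1.0, 2.0, 3.0]
student = [3.0, 2.0, 1.0]
loss = distillation_loss(student, teacher)
```

The loss is zero when the student reproduces the teacher's softened distribution and grows as the two diverge. A higher temperature flattens both distributions, so the student also learns from the teacher's relative ranking of the incorrect classes.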
