Energy-Efficient Vision Transformer Inference for Edge-AI Deployment
Nursultan Amanzhol, Jurn-Gyu Park

TL;DR
This paper introduces a two-stage evaluation pipeline for energy-efficient Vision Transformer (ViT) inference on edge-AI devices. It combines device-agnostic model selection with device-specific energy measurements and benchmarks 13 ViT models on an NVIDIA Jetson TX2 and an NVIDIA RTX 3050.
Contribution
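It proposes a two-stage pipeline for assessing ViT energy efficiency that integrates a device-agnostic metric (NetScore) with a device-related metric (the Sustainable Accuracy Metric, SAM), and reports benchmarking results for 13 ViT models on ImageNet-1K and CIFAR-10 across two NVIDIA devices.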
Findings
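Hybrid models such as LeViT_Conv_192 reduce energy consumption by up to 53% on the Jetson TX2 relative to a ViT baseline.
Distilled models such as TinyViT-11M_Distilled excel on the RTX 3050 mobile GPU (e.g., SAM5=1.72 on CIFAR-10 and SAM5=0.76 on ImageNet-1K).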
The pipeline effectively identifies energy-efficient ViT models for edge deployment.
Abstract
The growing deployment of Vision Transformers (ViTs) on energy-constrained devices requires evaluation methods that go beyond accuracy alone. We present a two-stage pipeline for assessing ViT energy efficiency that combines device-agnostic model selection with device-related measurements. We benchmark 13 ViT models on ImageNet-1K and CIFAR-10, running inference on an NVIDIA Jetson TX2 (edge device) and an NVIDIA RTX 3050 (mobile GPU). The device-agnostic stage uses the NetScore metric for screening; the device-related stage ranks models with the Sustainable Accuracy Metric (SAM). Results show that hybrid models such as LeViT_Conv_192 reduce energy by up to 53% on TX2 relative to a ViT baseline, reaching, e.g., SAM5=1.44 on TX2/CIFAR-10, while distilled models such as TinyViT-11M_Distilled excel on the mobile GPU (e.g., SAM5=1.72 on RTX 3050/CIFAR-10 and SAM5=0.76 on RTX 3050/ImageNet-1K).
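As a rough illustration of the two stages described in the abstract, the sketch below screens candidates with NetScore and then compares measured energy against a baseline. It is not the authors' implementation. The NetScore form and default exponents (alpha=2, beta=0.5, gamma=0.5) follow the commonly cited definition of the metric; the units (millions of parameters, billions of MACs), the `Candidate` type, and all helper names are assumptions made for this example. The SAM formula is not given in this summary, so it is deliberately left out, and the second stage is represented only by energy per inference (average power times latency) and the relative reduction against a baseline.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical container for one model's device-agnostic statistics."""
    name: str
    top1_pct: float   # top-1 accuracy in percent
    params_m: float   # parameter count, in millions (assumed unit)
    macs_g: float     # multiply-accumulate operations, in billions (assumed unit)


def netscore(c: Candidate, alpha: float = 2.0, beta: float = 0.5,
             gamma: float = 0.5) -> float:
    """NetScore: 20 * log10(accuracy^alpha / (params^beta * macs^gamma)).

    Only the relative ranking matters, provided every candidate uses the
    same units for parameters and MACs.
    """
    ratio = c.top1_pct ** alpha / (c.params_m ** beta * c.macs_g ** gamma)
    return 20.0 * math.log10(ratio)


def screen(candidates: list[Candidate], keep: int) -> list[Candidate]:
    """Stage 1 (device-agnostic): keep the `keep` highest-NetScore models."""
    return sorted(candidates, key=netscore, reverse=True)[:keep]


def energy_joules(avg_power_w: float, latency_s: float) -> float:
    """Stage 2 input (device-related): energy per inference, E = P * t."""
    return avg_power_w * latency_s


def energy_reduction_pct(baseline_j: float, model_j: float) -> float:
    """Relative energy saving of a model versus a baseline, in percent."""
    return 100.0 * (baseline_j - model_j) / baseline_j


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not results from the paper.
    pool = [
        Candidate("model_a", top1_pct=80.0, params_m=20.0, macs_g=4.0),
        Candidate("model_b", top1_pct=78.0, params_m=10.0, macs_g=1.0),
    ]
    for c in screen(pool, keep=1):
        print(c.name, round(netscore(c), 2))
    print(energy_reduction_pct(baseline_j=2.0, model_j=1.0))
```

In practice the power and latency inputs to `energy_joules` would come from on-device measurement during inference on the target hardware; the survivors of `screen` would then be ranked by the paper's SAM rather than by raw energy alone.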
Taxonomy
Topics: Advanced Neural Network Applications · Green IT and Sustainability · Advanced Memory and Neural Computing
