PIVOT- Input-aware Path Selection for Energy-efficient ViT Inference

Abhishek Moitra; Abhiroop Bhattacharjee; Priyadarshini Panda

arXiv:2404.15185·cs.AR·April 24, 2024

TL;DR

PIVOT is a framework that skips attention in vision transformers (ViTs) according to input difficulty, achieving 2.7x lower energy-delay product (EDP) on the ZCU102 MPSoC FPGA at a 0.2% accuracy reduction compared to LVViT-S.

Contribution

PIVOT introduces a hardware-in-loop co-search that finds attention skip configurations for input-aware attention skipping in ViTs, optimizing the delay-accuracy tradeoff.

Findings

01. 2.7x lower EDP on the ZCU102 MPSoC FPGA at a 0.2% accuracy reduction compared to LVViT-S

02. 1.3% higher accuracy than prior works on traditional CPUs and GPUs

03. 1.8x higher throughput than prior works on traditional CPUs and GPUs
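For readers unfamiliar with the metric, EDP is the product of energy and delay, so an "N times lower EDP" figure is the ratio of two such products. The sketch below shows only that arithmetic. The function names and the numbers are hypothetical placeholders chosen for illustration; they are not measurements from the paper.

```python
def edp(energy_joules: float, delay_seconds: float) -> float:
    """Energy-delay product: energy multiplied by delay (lower is better)."""
    return energy_joules * delay_seconds


def edp_reduction(baseline: tuple, candidate: tuple) -> float:
    """Return how many times lower the candidate's EDP is than the baseline's."""
    return edp(*baseline) / edp(*candidate)


# Hypothetical (energy in J, delay in s) pairs, NOT values from the paper.
baseline = (2.0, 0.010)
candidate = (1.5, 0.008)

ratio = edp_reduction(baseline, candidate)
print(f"candidate EDP is {ratio:.2f}x lower than baseline")
```

Because EDP multiplies the two quantities, a design that saves a little energy and a little delay at the same time can show a larger combined improvement than either saving alone.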

Abstract

The attention module in vision transformers (ViTs) performs intricate spatial correlations, contributing significantly to both accuracy and delay. It is therefore important to modulate the number of attention computations according to the input feature complexity to achieve optimal delay-accuracy tradeoffs. To this end, we propose PIVOT, a co-optimization framework that selectively skips attention based on input difficulty. PIVOT employs a hardware-in-loop co-search to obtain optimal attention skip configurations. Evaluations on the ZCU102 MPSoC FPGA show that PIVOT achieves 2.7x lower EDP (energy-delay product) at a 0.2% accuracy reduction compared to the LVViT-S ViT. PIVOT also achieves 1.3% higher accuracy and 1.8x higher throughput than prior works on traditional CPUs and GPUs. The PIVOT project can be found at https://github.com/Intelligent-Computing-Lab-Yale/PIVOT.
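The idea of input-aware attention skipping can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the difficulty score, the threshold, the two skip sets, and the toy layers are all hypothetical stand-ins, and the abstract does not say how difficulty is measured. The sketch only shows the control flow, where an easier input takes a path that bypasses attention in more encoder blocks.

```python
from typing import Callable, FrozenSet, List, Sequence

Vector = List[float]
Layer = Callable[[Vector], Vector]


def select_skips(difficulty: float, threshold: float,
                 low_effort: FrozenSet[int],
                 high_effort: FrozenSet[int]) -> FrozenSet[int]:
    """Pick a skip set; harder inputs get the path that keeps more attention."""
    return high_effort if difficulty > threshold else low_effort


def run_encoder(x: Vector, attn: Sequence[Layer], mlp: Sequence[Layer],
                skips: FrozenSet[int]) -> Vector:
    """Run encoder blocks, bypassing attention in blocks whose index is in skips."""
    for i, (attention, feed_forward) in enumerate(zip(attn, mlp)):
        if i not in skips:
            x = [a + b for a, b in zip(x, attention(x))]  # residual attention
        x = [a + b for a, b in zip(x, feed_forward(x))]   # residual MLP
    return x


def make_counting_attention(counter: List[int]) -> Layer:
    """Toy stand-in for attention that records how often it is called."""
    def attention(x: Vector) -> Vector:
        counter[0] += 1
        mean = sum(x) / len(x)
        return [mean for _ in x]
    return attention


def toy_mlp(x: Vector) -> Vector:
    """Toy stand-in for the MLP sublayer."""
    return [0.1 * v for v in x]


# All values below are assumptions made for this illustration.
DEPTH = 4
THRESHOLD = 0.5
LOW_EFFORT_SKIPS = frozenset({1, 2, 3})  # attention kept only in block 0
HIGH_EFFORT_SKIPS = frozenset({3})       # attention kept in blocks 0, 1, 2


def count_attention_calls(difficulty: float) -> int:
    """Run one toy input and report how many attention blocks executed."""
    counter = [0]
    skips = select_skips(difficulty, THRESHOLD, LOW_EFFORT_SKIPS, HIGH_EFFORT_SKIPS)
    run_encoder([1.0, 2.0, 3.0],
                [make_counting_attention(counter)] * DEPTH,
                [toy_mlp] * DEPTH,
                skips)
    return counter[0]


easy_calls = count_attention_calls(0.2)
hard_calls = count_attention_calls(0.9)
print(easy_calls, hard_calls)
```

In this toy setup the skip sets are fixed by hand. In PIVOT, per the abstract, the skip configurations come from a hardware-in-loop co-search, which this sketch does not attempt to model.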

Code & Models

Repositories

intelligent-computing-lab-yale/pivot (PyTorch, official)