RePaViT: Scalable Vision Transformer Acceleration via Structural Reparameterization on Feedforward Network Layers
Xuwei Xu, Yang Li, Yudong Chen, Jiajun Liu, Sen Wang

TL;DR
RePaViT introduces a structural reparameterization method focusing on FFN layers to significantly accelerate Vision Transformers during inference, achieving up to 68.7% speed-up with minimal accuracy loss.
Contribution
This work presents the first application of structural reparameterization on FFN layers in ViTs, enabling substantial inference speed improvements.
Findings
RePaViT achieves up to 68.7% speed-up on large models.
Accuracy drops only slightly, and in some cases improves, despite the acceleration.
Speed benefits scale with model size, with larger models gaining more.
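A rough way to see why the FFN share can grow with model size is to count multiply-accumulate operations (MACs) per Transformer block. The sketch below is illustrative only: it assumes a standard ViT block with an FFN expansion ratio of 4, a fixed 197-token input, and the usual Tiny/Small/Base/Large/Huge embedding widths. MACs are only a proxy for latency, and none of these numbers are measurements from the paper. The helper name `ffn_mac_share` is hypothetical.

```python
def ffn_mac_share(n_tokens, dim, expansion=4):
    """Fraction of one block's MACs spent in the FFN (biases and norms ignored)."""
    # Attention: Q, K, V and output projections (4 * N * d^2)
    # plus the QK^T and attention-times-V products (2 * N^2 * d).
    attn_macs = 4 * n_tokens * dim**2 + 2 * n_tokens**2 * dim
    # FFN: two linear layers, d -> expansion*d -> d.
    ffn_macs = 2 * expansion * n_tokens * dim**2
    return ffn_macs / (attn_macs + ffn_macs)

# Assumed embedding widths of common ViT sizes; token count held at 197.
widths = {"Tiny": 192, "Small": 384, "Base": 768, "Large": 1024, "Huge": 1280}
shares = {name: ffn_mac_share(197, d) for name, d in widths.items()}

for name, share in shares.items():
    print(f"{name:>5}: FFN share of per-layer MACs = {share:.3f}")
```

Under these assumptions the share simplifies to 8d / (12d + 2N), which rises toward 2/3 as the width d grows with the token count N held fixed.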
Abstract
We reveal that feedforward network (FFN) layers, rather than attention layers, are the primary contributors to Vision Transformer (ViT) inference latency, with their impact growing as model size increases. This finding highlights a critical opportunity for optimizing the efficiency of large-scale ViTs by focusing on FFN layers. In this work, we propose a novel channel idle mechanism that facilitates post-training structural reparameterization for efficient FFN layers during testing. Specifically, a set of feature channels remains idle and bypasses the nonlinear activation function in each FFN layer, thereby forming a linear pathway that enables structural reparameterization during inference. This mechanism results in a family of ReParameterizable Vision Transformers (RePaViTs), which achieve remarkable latency reductions with acceptable sacrifices (sometimes gains) in accuracy across…
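The channel idle idea can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it omits normalization layers and the residual shortcut, uses a tanh-approximated GELU, and picks the sizes `d`, `hidden`, and `n_idle` arbitrarily. All function names are hypothetical. The point it shows is that hidden channels which skip the activation form a purely linear path, so their two weight slices can be multiplied into a single d × d matrix after training without changing the output.

```python
import numpy as np

def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def ffn_train_form(x, W1, b1, W2, b2, n_idle):
    """FFN in which the first n_idle hidden channels bypass the activation."""
    h = x @ W1 + b1
    h = np.concatenate([h[:, :n_idle], gelu(h[:, n_idle:])], axis=1)
    return h @ W2 + b2

def reparameterize(W1, b1, W2, b2, n_idle):
    """Fold the idle (linear) channels into one d x d matrix and one bias."""
    W_lin = W1[:, :n_idle] @ W2[:n_idle, :]
    b_lin = b1[:n_idle] @ W2[:n_idle, :] + b2
    # The remaining channels still pass through the activation.
    return W_lin, b_lin, W1[:, n_idle:], b1[n_idle:], W2[n_idle:, :]

def ffn_infer_form(x, W_lin, b_lin, W1_act, b1_act, W2_act):
    """Merged FFN: one small linear map plus a narrower nonlinear branch."""
    return x @ W_lin + b_lin + gelu(x @ W1_act + b1_act) @ W2_act

rng = np.random.default_rng(0)
d, hidden, n_idle = 8, 32, 24  # illustrative sizes only
W1, b1 = rng.standard_normal((d, hidden)), rng.standard_normal(hidden)
W2, b2 = rng.standard_normal((hidden, d)), rng.standard_normal(d)
x = rng.standard_normal((4, d))

params = reparameterize(W1, b1, W2, b2, n_idle)
y_train = ffn_train_form(x, W1, b1, W2, b2, n_idle)
y_infer = ffn_infer_form(x, *params)
print("outputs match:", np.allclose(y_train, y_infer))
```

With these sizes the two original weight matrices hold 2 · 8 · 32 = 512 entries, while the merged form holds 8 · 8 + 2 · 8 · 8 = 192, which is where the inference saving in this toy setup comes from.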
Taxonomy
Topics: CCD and CMOS Imaging Sensors · Advanced Memory and Neural Computing · Infrared Target Detection Methodologies
