Token Fusion: Bridging the Gap between Token Pruning and Token Merging
Minchul Kim, Shangqian Gao, Yen-Chang Hsu, Yilin Shen, Hongxia Jin

TL;DR
Token Fusion (ToFu) combines token pruning and merging techniques to reduce computational costs in Vision Transformers, maintaining accuracy and enabling efficient deployment on resource-limited devices.
Contribution
The paper introduces Token Fusion, a novel scheme that integrates token pruning and merging, along with MLERP merging to preserve feature norms, enhancing efficiency without retraining.
Findings
ToFu improves computational efficiency in ViTs.
ToFu maintains or enhances model accuracy.
ToFu sets new benchmarks in classification and image generation.
Abstract
Vision Transformers (ViTs) have emerged as powerful backbones in computer vision, outperforming many traditional CNNs. However, their computational overhead, largely attributed to the self-attention mechanism, makes deployment on resource-constrained edge devices challenging. Multiple solutions rely on token pruning or token merging. In this paper, we introduce "Token Fusion" (ToFu), a method that amalgamates the benefits of both token pruning and token merging. Token pruning proves advantageous when the model exhibits sensitivity to input interpolations, while token merging is effective when the model manifests close to linear responses to inputs. We combine this to propose a new scheme called Token Fusion. Moreover, we tackle the limitations of average merging, which doesn't preserve the intrinsic feature norm, resulting in distributional shifts. To mitigate this, we introduce MLERP…
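The norm problem the abstract attributes to average merging can be seen in a small sketch. The rescaling rule used here (take the direction of the average and rescale it to the mean of the two input norms) is an assumption chosen purely for illustration. The abstract is cut off before it defines MLERP, so this should not be read as the paper's formulation. The function names are hypothetical.

```python
import math


def norm(v):
    """Euclidean (L2) norm of a feature vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def average_merge(a, b):
    """Plain average of two token feature vectors."""
    return [(x + y) / 2.0 for x, y in zip(a, b)]


def norm_preserving_merge(a, b, eps=1e-12):
    """Illustrative merge: keep the direction of the average, but rescale it
    so its norm equals the mean of the input norms.
    ASSUMPTION: this rescaling rule is a stand-in, not the paper's MLERP."""
    avg = average_merge(a, b)
    target = (norm(a) + norm(b)) / 2.0
    scale = target / max(norm(avg), eps)  # eps guards against a zero average
    return [x * scale for x in avg]


# Two unit-norm tokens pointing in different directions.
a = [1.0, 0.0]
b = [0.0, 1.0]

avg = average_merge(a, b)
merged = norm_preserving_merge(a, b)

print(norm(a), norm(b))  # both inputs have norm 1.0
print(norm(avg))         # roughly 0.707: averaging shrinks the norm
print(norm(merged))      # roughly 1.0: the rescaled merge keeps it
```

Averaging two tokens that are not parallel always yields a vector shorter than the mean of their norms, which is one way to picture the distributional shift the abstract describes. Any rule that restores the norm after merging avoids that particular shrinkage.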
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI
Methods: Pruning · ToFu
