FLuRKA: Fast and accurate unified Low-Rank & Kernel Attention
Ahan Gupta, Hao Guo, Yueming Yuan, Yanqi Zhou, Charith Mendis

TL;DR
FLuRKA introduces a unified low-rank and kernel attention mechanism that significantly accelerates transformer training and inference while maintaining or improving accuracy across diverse tasks.
Contribution
The paper proposes FLuRKA, a fusion of low-rank and kernel attention methods that yields a training-efficient transformer, supported by both theoretical and empirical analyses of its speed and quality.
Findings
FLuRKA achieves up to 3.3x speedup over low-rank methods.
FLuRKA attains up to 20x speedup over flash-attention models.
FLuRKA maintains or surpasses the accuracy of constituent methods across various tasks.
Abstract
Many efficient self-attention techniques have become prevalent since the inception of the transformer architecture. Two popular classes of these techniques are low-rank and kernel methods. Each of these methods has its strengths. We observe these strengths synergistically complement each other and exploit them to fuse low-rank and kernel methods, producing a new class of transformers: FLuRKA (Fast Low-Rank & Kernel Attention). FLuRKA are highly training-efficient, with faster model speeds and similar model qualities compared to constituent low-rank and kernel methods. We theoretically and empirically evaluate the speed and quality of FLuRKA. Our model speed analysis posits a variety of parameter configurations where FLuRKA exhibit speedups over low-rank and kernel approximations and our…
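To make the idea of fusing the two families concrete, here is a minimal sketch of one plausible combination. It is not taken from the paper: it assumes a Linformer-style down-projection of keys and values (the low-rank part) followed by linear attention with the `elu(x) + 1` feature map (the kernel part). The function `fused_attention`, the projection matrices `e_k` and `e_v`, and the choice of feature map are all hypothetical, chosen only for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def feature_map(x):
    """Positive feature map elu(x) + 1 (an assumed choice, common in linear attention)."""
    return [[v + 1.0 if v > 0 else math.exp(v) for v in row] for row in x]


def fused_attention(q, k, v, e_k, e_v):
    """Illustrative low-rank + kernel attention (assumption, not the paper's code).

    q, k, v  : n x d matrices (sequence length n, head dimension d)
    e_k, e_v : r x n hypothetical down-projection matrices, with r << n
    """
    # Low-rank step: shrink the sequence axis of keys and values from n to r.
    k_low = matmul(e_k, k)  # r x d
    v_low = matmul(e_v, v)  # r x d

    # Kernel step: replace softmax with a feature map, so the product can be
    # reassociated as phi(Q) (phi(K)^T V) instead of (phi(Q) phi(K)^T) V.
    phi_q = feature_map(q)      # n x d
    phi_k = feature_map(k_low)  # r x d

    kv = matmul(transpose(phi_k), v_low)       # d x d summary of keys and values
    k_sum = [sum(col) for col in zip(*phi_k)]  # length-d normaliser

    numerator = matmul(phi_q, kv)  # n x d
    denominator = [sum(a * b for a, b in zip(row, k_sum)) for row in phi_q]
    return [[x / z for x in row] for row, z in zip(numerator, denominator)]


if __name__ == "__main__":
    random.seed(0)
    n, d, r = 6, 4, 2
    rand = lambda rows, cols: [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]
    out = fused_attention(rand(n, d), rand(n, d), rand(n, d), rand(r, n), rand(r, n))
    print(len(out), len(out[0]))
```

Under these assumptions the cost is O(nrd + rd² + nd²) rather than the O(n²d) of full attention, since no n × n matrix is ever formed. Each output row is a normalised, positively weighted average of the r projected value rows.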
Taxonomy
Topics: Machine Learning and ELM · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
