SCRAM: Spatially Coherent Randomized Attention Maps

Dan A. Calian; Peter Roelants; Jacques Cali; Ben Carr; Krishna Dubba; John E. Reid; Dell Zhang

arXiv:1905.10308 · cs.LG · May 27, 2019 · 1 citation

Open Access

TL;DR

SCRAM is a fast randomized algorithm that approximates attention maps in Transformer models with O(n log(n)) complexity by exploiting spatial coherence and sparsity in images, enabling scalable deep learning applications.

Contribution

The paper introduces SCRAM, a novel randomized method that significantly accelerates attention map computation by leveraging spatial coherence and sparse structures in images.

Findings

01. SCRAM achieves O(n log(n)) complexity for attention map computation.

02. Preliminary results show SCRAM effectively speeds up attention in Transformer models.

03. SCRAM maintains accuracy while reducing computational cost.

Abstract

Attention mechanisms and non-local mean operations in general are key ingredients in many state-of-the-art deep learning techniques. In particular, the Transformer model based on multi-head self-attention has recently achieved great success in natural language processing and computer vision. However, the vanilla algorithm computing the Transformer of an image with n pixels has O(n^2) complexity, which is often painfully slow and sometimes prohibitively expensive for large-scale image data. In this paper, we propose a fast randomized algorithm --- SCRAM --- that only requires O(n log(n)) time to produce an image attention map. Such a dramatic acceleration is attributed to our insight that attention maps on real-world images usually exhibit (1) spatial coherence and (2) sparse structure. The central idea of SCRAM is to employ PatchMatch, a randomized correspondence algorithm, to quickly…
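
To make the cost argument in the abstract concrete, here is a minimal pure-Python sketch contrasting vanilla attention, where every query scores every key, with attention restricted to a short candidate list per query. The function names and the `candidates` argument are assumptions made for illustration and do not come from the paper. In particular, the sketch takes the candidate lists as given; how SCRAM finds them with PatchMatch is not shown, since the abstract is cut off before describing it.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, keys, values):
    """Softmax-weighted average of `values`, scored by scaled query-key dot products."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]


def dense_attention(q, k, v):
    """Vanilla attention: each of the n queries scores all n keys (n * n dot products)."""
    return [attend(qi, k, v) for qi in q]


def sparse_attention(q, k, v, candidates):
    """Hypothetical sparse variant: query i scores only the keys in candidates[i].

    With m candidates per query this costs n * m dot products instead of n * n.
    The candidate lists are an input here; this is not the paper's search procedure.
    """
    return [
        attend(qi, [k[j] for j in idx], [v[j] for j in idx])
        for qi, idx in zip(q, candidates)
    ]
```

If every candidate list contains all n key indices, `sparse_attention` reproduces `dense_attention` exactly; the saving comes only from keeping each list much shorter than n, which is plausible when attention is sparse as the abstract argues.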

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam · Softmax