Parallel Scheduling Self-attention Mechanism: Generalization and Optimization
Mingfei Yu, Masahiro Fujita

TL;DR
This paper introduces a general parallel scheduling algorithm for self-attention in deep learning that distributes the computation across architectures with many computing units, and it shows experimentally that skipping redundant calculations significantly reduces the amount of computation.
Contribution
It proposes a novel scheduling algorithm derived from SAT solver solutions for parallelizing self-attention computations, including optimization strategies for reducing redundant calculations.
Findings
Skipping redundant computations yields reductions of almost 25% and 50% of the original computations.
Provides algorithms applicable to various problem sizes with divisibility constraints.
Experimental results validate the correctness and efficiency of the proposed methods.
Abstract
Over the past few years, self-attention has excelled in deep learning, especially in natural language processing (NLP). Its impressive effectiveness, along with its ubiquitous implementations, has aroused our interest in efficiently scheduling the data-flow of the corresponding computations onto architectures with many computing units to realize parallel computing. In this paper, based on the theory of the self-attention mechanism and state-of-the-art realizations of self-attention in language models, we propose a general scheduling algorithm, derived from the optimum scheduling for small instances solved by a satisfiability checking (SAT) solver, to parallelize typical computations of self-attention. Strategies for further optimization by skipping redundant computations are put forward as well, with which reductions of almost 25% and 50% of the original computations…
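To make the scheduling problem in the abstract concrete, the following is a minimal sketch of the kind of computation being distributed: scaled dot-product self-attention, with its query rows split across several computing units. It is not the paper's SAT-derived algorithm. The round-robin assignment, the function names (`schedule_round_robin`, `attention_rows`, `parallel_self_attention`), and the omission of the learned query/key/value projections are all assumptions made for illustration, and the "units" are simulated sequentially.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_rows(Q, K, V, rows):
    """Compute the rows of softmax(Q K^T / sqrt(d)) V for the given query indices."""
    d = len(Q[0])
    out = {}
    for i in rows:
        # Scaled dot product of query i with every key.
        scores = [
            sum(q * k for q, k in zip(Q[i], K[j])) / math.sqrt(d)
            for j in range(len(K))
        ]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out[i] = [
            sum(w * V[j][c] for j, w in enumerate(weights))
            for c in range(len(V[0]))
        ]
    return out


def schedule_round_robin(n, units):
    """Hypothetical schedule: unit u handles query rows u, u + units, ..."""
    return [list(range(u, n, units)) for u in range(units)]


def parallel_self_attention(Q, K, V, units):
    """Simulate `units` computing units, each handling its assigned rows."""
    result = [None] * len(Q)
    for rows in schedule_round_robin(len(Q), units):  # one iteration per unit
        for i, row in attention_rows(Q, K, V, rows).items():
            result[i] = row
    return result


# Assumption: projections are omitted, so Q = K = V = X.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
serial = parallel_self_attention(X, X, X, units=1)
split = parallel_self_attention(X, X, X, units=2)
```

Because each output row depends only on its own query and on all keys and values, any assignment of rows to units gives the same result as the serial computation. A real schedule must also decide how keys and values move between units, which is the data-flow question the abstract refers to.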
Taxonomy
Topics: Scheduling and Optimization Algorithms · Distributed and Parallel Computing Systems · Parallel Computing and Optimization Techniques
