Composing Finite State Transducers on GPUs

Arturo Argueta; David Chiang

arXiv:1805.06383·cs.CL·May 17, 2018

TL;DR

This paper presents what the authors believe to be the first GPU implementation of finite state transducer composition, reporting speedups of up to 6x over their own serial implementation and up to 4.5x over OpenFST.

Contribution

The paper introduces a GPU implementation of the FST composition operation, the first to the authors' knowledge, and discusses the optimizations used to achieve the best performance on this architecture.

Findings

01 Up to 6x speedup over serial implementation
02 Up to 4.5x speedup over OpenFST
03 Effective GPU optimizations for FST operations

Abstract

Weighted finite-state transducers (FSTs) are frequently used in language processing to handle tasks such as part-of-speech tagging and speech recognition. There has been previous work using multiple CPU cores to accelerate finite state algorithms, but limited attention has been given to parallel graphics processing unit (GPU) implementations. In this paper, we introduce the first (to our knowledge) GPU implementation of the FST composition operation, and we also discuss the optimizations used to achieve the best performance on this architecture. We show that our approach obtains speedups of up to 6x over our serial implementation and 4.5x over OpenFST.
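
For readers unfamiliar with the operation being accelerated, the following is a minimal serial sketch of weighted FST composition in Python. It is not the paper's GPU algorithm and not OpenFST's implementation. It assumes epsilon-free transducers and tropical-semiring weights (weights along matched arcs are added), and the dictionary layout and the name compose are hypothetical, chosen only for illustration.

```python
from collections import defaultdict, deque


def compose(fst_a, fst_b):
    """Compose two epsilon-free weighted FSTs (assumed tropical semiring).

    Each FST is a dict with keys "start", "finals" (a set of states) and
    "arcs" (a list of (src, in_label, out_label, weight, dst) tuples).
    This layout is an illustrative assumption, not a library format.
    """
    # Index A's arcs by source state.
    a_by_src = defaultdict(list)
    for src, ilab, olab, w, dst in fst_a["arcs"]:
        a_by_src[src].append((ilab, olab, w, dst))

    # Index B's arcs by (source state, input label) so matches are lookups.
    b_by_src_in = defaultdict(list)
    for src, ilab, olab, w, dst in fst_b["arcs"]:
        b_by_src_in[(src, ilab)].append((olab, w, dst))

    start = (fst_a["start"], fst_b["start"])
    arcs, finals = [], set()
    seen, queue = {start}, deque([start])

    # Breadth-first search over reachable state pairs (q1, q2).
    while queue:
        q1, q2 = queue.popleft()
        if q1 in fst_a["finals"] and q2 in fst_b["finals"]:
            finals.add((q1, q2))
        for x, y, w1, r1 in a_by_src[q1]:
            # An arc of A matches an arc of B when A's output label
            # equals B's input label.
            for z, w2, r2 in b_by_src_in[(q2, y)]:
                nxt = (r1, r2)
                arcs.append(((q1, q2), x, z, w1 + w2, nxt))
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)

    return {"start": start, "finals": finals, "arcs": arcs}


# Toy example: A rewrites "a" as "x"; B rewrites "x" as "P".
fst_a = {"start": 0, "finals": {1}, "arcs": [(0, "a", "x", 1.0, 1)]}
fst_b = {"start": 0, "finals": {1}, "arcs": [(0, "x", "P", 0.5, 1)]}
result = compose(fst_a, fst_b)
print(result["arcs"])
```

The nested loop over matching arc pairs is the part of the computation that is naturally independent across pairs, which is what makes the operation a candidate for parallel hardware. How the paper actually maps this work onto the GPU is described in the paper itself and is not reproduced here.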
