GPU-Accelerated Forward-Backward algorithm with Application to Lattice-Free MMI
Lucas Ondel, Léa-Marie Lam-Yee-Mui, Martin Kocour, Caio Filippo Corro, Lukáš Burget

TL;DR
This paper introduces a GPU-optimized forward-backward algorithm expressed through sparse matrix operations in a semiring, enabling faster training of TDNNs with LF-MMI without approximations.
Contribution
It presents a novel GPU-friendly formulation of the forward-backward algorithm using semiring algebra, improving training speed for LF-MMI-based models.
Findings
The implementation is about twice as fast as PyChain, a recent C++/CUDA implementation of the LF-MMI loss.
No need for approximations like leaky-HMM in training.
Easy to implement in Julia or any language with native support for semiring algebra.
Abstract
We propose to express the forward-backward algorithm in terms of operations between sparse matrices in a specific semiring. This new perspective naturally leads to a GPU-friendly algorithm which is easy to implement in Julia or any programming language with native support for semiring algebra. We use this new implementation to train a TDNN with the LF-MMI objective function and we compare the training time of our system with PyChain, a recently introduced C++/CUDA implementation of the LF-MMI loss. Our implementation is about two times faster while not having to use any approximation such as the "leaky-HMM".
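To make the idea in the abstract concrete, here is a minimal CPU-only sketch of forward-backward written as sparse matrix-vector products in the log semiring, where "addition" is log-sum-exp and "multiplication" is ordinary addition. It is not the authors' implementation. Python stands in for Julia, and the dictionary-based sparse matrix, the function names (`log_add`, `vec_mat`, `mat_vec`, `forward_backward`), and the toy two-state model are all assumptions made for illustration.

```python
import math

NEG_INF = float("-inf")  # the semiring "zero" (log of probability 0)


def log_add(a, b):
    """Semiring addition in the log semiring: log(exp(a) + exp(b))."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def vec_mat(vec, sparse):
    """Semiring product v^T T, with T stored sparsely as {(i, j): log_weight}."""
    out = [NEG_INF] * len(vec)
    for (i, j), w in sparse.items():
        out[j] = log_add(out[j], vec[i] + w)  # semiring "times" is +
    return out


def mat_vec(sparse, vec):
    """Semiring product T v for the same sparse storage."""
    out = [NEG_INF] * len(vec)
    for (i, j), w in sparse.items():
        out[i] = log_add(out[i], w + vec[j])
    return out


def forward_backward(init, trans, final, loglik):
    """Return (per-frame log posteriors, total log-likelihood).

    init, final: log-weights per state; trans: sparse log transition matrix;
    loglik[n][s]: log-likelihood of frame n under state s.
    """
    n_frames = len(loglik)
    # Forward pass: alpha_n = (alpha_{n-1}^T T) "times" loglik_n, elementwise.
    alphas = [[a + l for a, l in zip(init, loglik[0])]]
    for n in range(1, n_frames):
        pred = vec_mat(alphas[-1], trans)
        alphas.append([p + l for p, l in zip(pred, loglik[n])])
    # Backward pass: beta_n = T (beta_{n+1} "times" loglik_{n+1}).
    betas = [None] * n_frames
    betas[-1] = list(final)
    for n in range(n_frames - 2, -1, -1):
        weighted = [b + l for b, l in zip(betas[n + 1], loglik[n + 1])]
        betas[n] = mat_vec(trans, weighted)
    # Total log-likelihood: semiring dot product of the last alpha with final.
    total = NEG_INF
    for a, f in zip(alphas[-1], final):
        total = log_add(total, a + f)
    posts = [[a + b - total for a, b in zip(alphas[n], betas[n])]
             for n in range(n_frames)]
    return posts, total


# Toy left-to-right model (hypothetical numbers): start in state 0, end in state 1.
init = [0.0, NEG_INF]
final = [NEG_INF, 0.0]
trans = {(0, 0): math.log(0.6), (0, 1): math.log(0.4), (1, 1): 0.0}
loglik = [[math.log(0.9), math.log(0.2)],
          [math.log(0.5), math.log(0.5)],
          [math.log(0.1), math.log(0.8)]]

posts, total = forward_backward(init, trans, final, loglik)
```

In this toy model only two state paths are allowed (0, 0, 1 and 0, 1, 1), so the middle-frame posteriors are 0.375 and 0.625 and the total likelihood is 0.2304. Because the recursion consists only of semiring matrix-vector products, the same structure can in principle be batched and run on a GPU with a sparse-matrix library that accepts a custom semiring, which is the property the abstract relies on.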
Taxonomy
Topics: Medical Imaging Techniques and Applications · Matrix Theory and Algorithms · Model Reduction and Neural Networks
