Efficient Dynamic WFST Decoding for Personalized Language Models

Jun Liu; Jiedan Zhu; Vishal Kathuria; Fuchun Peng

arXiv:1910.10670 · cs.CL · October 24, 2019 · 6 cites

TL;DR

This paper introduces a two-layer cache system for dynamic WFST decoding that significantly accelerates personalized speech recognition by sharing static components globally and personalized components privately.
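The split between a globally shared layer and a per-user layer can be illustrated with a minimal sketch. Everything named here (`TwoLayerCache`, `is_static`, `expand`, the state encoding) is hypothetical and chosen for illustration; it is not the paper's implementation, and arc weights and thread synchronization are left out.

```python
from typing import Callable, Dict, Hashable, List, Tuple

Arc = Tuple[str, Hashable]  # (label, next_state); weights omitted for brevity


class TwoLayerCache:
    """Hypothetical two-layer cache for lazily expanded decoder states."""

    def __init__(
        self,
        public: Dict[Hashable, List[Arc]],
        is_static: Callable[[Hashable], bool],
        expand: Callable[[Hashable], List[Arc]],
    ) -> None:
        self.public = public  # one dict shared by every user's decoder
        self.private: Dict[Hashable, List[Arc]] = {}  # owned by one user
        self.is_static = is_static  # assumption: tells static from personalized states
        self.expand = expand  # the expensive on-the-fly expansion
        self.expansions = 0  # counts cache misses

    def arcs(self, state: Hashable) -> List[Arc]:
        # Look in the shared layer first, then in this user's layer.
        if state in self.public:
            return self.public[state]
        if state in self.private:
            return self.private[state]
        # Miss in both layers: expand once and store in the matching layer.
        self.expansions += 1
        result = self.expand(state)
        target = self.public if self.is_static(state) else self.private
        target[state] = result
        return result
```

In this sketch, one `TwoLayerCache` would be created per user while the same `public` dictionary is passed to all of them, so states from the static part are expanded once for everyone and personalized states never leave the user's own cache. A real server would also need locking around the shared dictionary.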

Contribution

It presents a novel two-layer cache mechanism and pre-initialization methods that substantially improve decoding speed for personalized language models.

Findings

01 Public cache reduces decoding time by a factor of three compared to decoding without pre-initialization.

02 Private cache further reduces decoding time by a factor of five.

03 Proposed methods outperform traditional decoding approaches.

Abstract

We propose a two-layer cache mechanism to speed up dynamic WFST decoding with personalized language models. The first layer is a public cache that stores most of the static part of the graph. This is shared globally among all users. A second layer is a private cache that caches the graph that represents the personalized language model, which is only shared by the utterances from a particular user. We also propose two simple yet effective pre-initialization methods, one based on breadth-first search, and another based on a data-driven exploration of decoder states using previous utterances. Experiments with a calling speech recognition task using a personalized contact list demonstrate that the proposed public cache reduces decoding time by a factor of three compared to decoding without pre-initialization. Using the private cache provides additional efficiency gains, reducing the decoding…
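The two pre-initialization ideas named in the abstract can be sketched roughly as follows. This is an illustrative approximation under stated assumptions, not the authors' code: `expand` is a hypothetical function returning the successor states of a state, the cache is a plain dictionary, and "data-driven exploration" is approximated by replaying the states that earlier utterances are assumed to have logged.

```python
from collections import deque
from typing import Callable, Dict, Hashable, Iterable, List

Expand = Callable[[Hashable], List[Hashable]]  # state -> successor states


def bfs_preinit(
    cache: Dict[Hashable, List[Hashable]],
    expand: Expand,
    start: Hashable,
    max_states: int,
) -> None:
    """Expand states breadth-first from `start` until the cache holds `max_states`."""
    queue = deque([start])
    seen = {start}
    while queue and len(cache) < max_states:
        state = queue.popleft()
        if state not in cache:
            cache[state] = expand(state)
        for nxt in cache[state]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)


def replay_preinit(
    cache: Dict[Hashable, List[Hashable]],
    expand: Expand,
    visited_logs: Iterable[Iterable[Hashable]],
) -> None:
    """Expand every state that previous utterances visited (assumed log format)."""
    for log in visited_logs:
        for state in log:
            if state not in cache:
                cache[state] = expand(state)


# Toy graph standing in for the lazily composed decoding graph.
toy = {0: [1, 2], 1: [3], 2: [3], 3: []}
warm: Dict[Hashable, List[Hashable]] = {}
bfs_preinit(warm, lambda s: toy[s], start=0, max_states=3)
replay_preinit(warm, lambda s: toy[s], visited_logs=[[0, 2, 3]])
```

The breadth-first variant needs no data but spends its budget on states near the start regardless of how often they are used, whereas the replay variant only warms states that real traffic has reached; the `max_states` budget is an assumption added here to keep the sketch bounded.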

Taxonomy

Topics: Algorithms and Data Compression · Speech Recognition and Synthesis · Natural Language Processing Techniques
