# Locale Encoding For Scalable Multilingual Keyword Spotting Models

**Authors:** Pai Zhu, Hyun Jin Park, Alex Park, Angelo Scorza Scarpati, Ignacio Lopez Moreno

arXiv: 2302.12961 · 2023-02-28

## TL;DR

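This paper introduces locale-conditioned universal models for multilingual keyword spotting (KWS). They improve accuracy over both monolingual and unconditioned universal baselines across 10 locales and different noise conditions; the best variant, FiLM, reduces the false rejection rate (FRR) by 61% (relative) on average.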

## Contribution

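It proposes two locale encoding methods for conditioning a single universal KWS model on the locale: locale feature concatenation and feature-wise linear modulation (FiLM). Both are compared against locale-specific monolingual models and a single universal model trained over all data.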
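To make the difference between the two conditioning styles concrete, here is a minimal NumPy sketch. It is not the paper's architecture, which this summary does not show: the dimensions, the randomly initialised tables, and the function names `concat_condition` and `film_condition` are all assumptions for illustration. FiLM is simplified here to a per-locale lookup of scale and shift vectors.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_LOCALES = 10   # the paper evaluates 10 locales
FEATURE_DIM = 40   # hypothetical per-frame feature size
EMBED_DIM = 8      # hypothetical locale embedding size

# Hypothetical "learned" parameters; random values stand in for training.
locale_embedding = rng.normal(size=(NUM_LOCALES, EMBED_DIM))
gamma_table = rng.normal(size=(NUM_LOCALES, FEATURE_DIM))  # FiLM scale
beta_table = rng.normal(size=(NUM_LOCALES, FEATURE_DIM))   # FiLM shift


def concat_condition(features, locale_id):
    """Append the locale embedding to every frame: (T, F) -> (T, F + E)."""
    num_frames = features.shape[0]
    emb = np.broadcast_to(locale_embedding[locale_id], (num_frames, EMBED_DIM))
    return np.concatenate([features, emb], axis=1)


def film_condition(features, locale_id):
    """Scale and shift each feature channel per locale: (T, F) -> (T, F)."""
    return gamma_table[locale_id] * features + beta_table[locale_id]


frames = rng.normal(size=(100, FEATURE_DIM))  # 100 frames of dummy features
concat_out = concat_condition(frames, locale_id=3)
film_out = film_condition(frames, locale_id=3)
print(concat_out.shape, film_out.shape)  # (100, 48) (100, 40)
```

Concatenation widens the input and leaves the downstream layers to learn how to use the locale vector, whereas FiLM keeps the feature shape and modulates each channel directly.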

## Key findings

- FiLM performed best, reducing the false rejection rate (FRR) by 61% (relative) on average compared to monolingual KWS models of similar size.
- Locale-conditioned models outperform both baseline methods across all 10 tested locales.
- The accuracy gains hold across different noise conditions.
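The 61% figure is a relative reduction, not an absolute one. The short sketch below shows how such a number is computed; the FRR values in it are made up for illustration and are not taken from the paper, and `relative_frr_reduction` is a hypothetical helper name.

```python
def relative_frr_reduction(baseline_frr, new_frr):
    """Return the fractional reduction of new_frr relative to baseline_frr."""
    if baseline_frr <= 0:
        raise ValueError("baseline_frr must be positive")
    return (baseline_frr - new_frr) / baseline_frr


# Made-up example values (NOT from the paper): FRR drops from 5.0% to 1.95%.
baseline = 0.050
conditioned = 0.0195
reduction = relative_frr_reduction(baseline, conditioned)
print(f"{reduction:.0%}")
```

With these illustrative numbers, an absolute drop of about 3 percentage points corresponds to a relative reduction of about 61%.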

## Abstract

A Multilingual Keyword Spotting (KWS) system detects spoken keywords over multiple locales. Conventional monolingual KWS approaches do not scale well to multilingual scenarios because of high development/maintenance costs and lack of resource sharing. To overcome this limit, we propose two locale-conditioned universal models with locale feature concatenation and feature-wise linear modulation (FiLM). We compare these models with two baseline methods: locale-specific monolingual KWS, and a single universal model trained over all data. Experiments over 10 localized language datasets show that locale-conditioned models substantially improve accuracy over baseline methods across all locales in different noise conditions. FiLM performed the best, improving on average FRR by 61% (relative) compared to monolingual KWS models of similar sizes.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.12961/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/2302.12961/full.md

## References

27 references — full list in the complete paper: https://tomesphere.com/paper/2302.12961/full.md

---
Source: https://tomesphere.com/paper/2302.12961