Knowledge Transfer for Efficient On-device False Trigger Mitigation

Pranay Dighe; Erik Marchi; Srikanth Vishnubhotla; Sachin Kajarekar; Devang Naik

arXiv:2010.10591 · eess.AS · October 22, 2020

TL;DR

This paper introduces a lightweight LSTM-based model for on-device false trigger mitigation in voice assistants. The model classifies utterances directly from acoustic features, without transcribing the audio, and is reported to mitigate 87% of false triggers at a 99% true positive rate (TPR).

Contribution

It presents an LSTM-based architecture, trained via knowledge transfer from a graph neural network, that detects false triggers without ASR transcripts and is small enough to run on resource-limited devices.
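As a rough illustration of what knowledge transfer from a teacher to a smaller student can look like, the sketch below combines a task loss with an embedding-matching term. This is a generic teacher-student objective, not necessarily the loss used in the paper: the function names, the choice of mean squared error for matching, and the weight `alpha` are all assumptions made for illustration.

```python
import math


def binary_cross_entropy(prob, label, eps=1e-7):
    """Cross-entropy for one example; prob is P(device-directed), label is 0 or 1."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def knowledge_transfer_loss(student_prob, student_embedding,
                            teacher_embedding, label, alpha=0.5):
    """Hypothetical student objective: task loss plus teacher-matching loss.

    alpha (an assumed hyperparameter) weights how strongly the student
    (e.g. an LSTM) is pulled towards the teacher (e.g. a graph neural network).
    """
    task = binary_cross_entropy(student_prob, label)
    match = mse(student_embedding, teacher_embedding)
    return (1.0 - alpha) * task + alpha * match
```

With `alpha=0` the student is trained on labels alone; with `alpha=1` it only imitates the teacher's embedding. Intermediate values trade off the two signals.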

Findings

01. Mitigates 87% of false triggers at a 99% true positive rate (TPR)

02. Operates effectively with only 1.69 seconds of audio in streaming scenarios

03. Models have a small footprint and are suitable for on-device deployment
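The headline metric, the share of false triggers mitigated at a fixed TPR, can be computed from classifier scores as sketched below. This is a minimal illustration, assuming each utterance has a score where higher means "more likely device-directed" and that an utterance is accepted when its score is at or above a threshold. The function name and the example scores are hypothetical, not taken from the paper.

```python
import math


def mitigation_at_tpr(true_scores, false_scores, target_tpr=0.99):
    """Return (threshold, achieved_tpr, mitigation_rate).

    true_scores: scores of genuinely device-directed utterances.
    false_scores: scores of false triggers.
    The threshold is the highest one that still accepts at least
    target_tpr of the true triggers.
    """
    if not true_scores or not false_scores:
        raise ValueError("both score lists must be non-empty")
    ranked = sorted(true_scores, reverse=True)
    # Number of true triggers that must be accepted to reach the target;
    # the small epsilon guards against floating-point round-up.
    needed = max(1, math.ceil(target_tpr * len(ranked) - 1e-9))
    threshold = ranked[needed - 1]
    achieved_tpr = sum(s >= threshold for s in true_scores) / len(true_scores)
    # A false trigger is mitigated when it falls below the threshold.
    mitigated = sum(s < threshold for s in false_scores) / len(false_scores)
    return threshold, achieved_tpr, mitigated


# Toy example with made-up scores and a 75% TPR target.
print(mitigation_at_tpr([0.9, 0.8, 0.7, 0.2], [0.1, 0.3, 0.75, 0.6], 0.75))
```

Under this definition, "87% at 99% TPR" would mean that the threshold retaining 99% of true triggers rejects 87% of false ones.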

Abstract

In this paper, we address the task of determining whether a given utterance is directed towards a voice-enabled smart-assistant device or not. An undirected utterance is termed as a "false trigger" and false trigger mitigation (FTM) is essential for designing a privacy-centric non-intrusive smart assistant. The directedness of an utterance can be identified by running automatic speech recognition (ASR) on it and determining the user intent by analyzing the ASR transcript. But in case of a false trigger, transcribing the audio using ASR itself is strongly undesirable. To alleviate this issue, we propose an LSTM-based FTM architecture which determines the user intent from acoustic features directly without explicitly generating ASR transcripts from the audio. The proposed models are small footprint and can be run on-device with limited computational resources. During training, the model…

Taxonomy

Methods: Graph Neural Network