An End-to-End Approach for Korean Wakeword Systems with Speaker Authentication

Geonwoo Seo (Dongguk University)

arXiv:2501.12194·cs.SD·January 22, 2025

TL;DR

This paper presents an end-to-end Korean wakeword detection and voice authentication system that addresses language-specific challenges and privacy concerns, reporting a 16.79% EER for wakeword detection and a 6.6% EER for voice authentication.

Contribution

It introduces a novel Korean wakeword training method combined with voice authentication using an open-source platform, enhancing privacy and language adaptability.

Findings

01. Achieved 16.79% EER in wakeword detection

02. Achieved 6.6% EER in voice authentication

03. Demonstrated effectiveness for Korean language applications
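For readers unfamiliar with the metric, the equal error rate (EER) is the operating point at which the false acceptance rate equals the false rejection rate. The sketch below is a generic illustration of that definition, not the paper's evaluation code; the function name `equal_error_rate` and the sample scores are hypothetical, and it assumes that higher scores mean "more likely genuine".

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER from two lists of similarity scores.

    genuine:  scores for trials that should be accepted (hypothetical data).
    impostor: scores for trials that should be rejected (hypothetical data).
    A trial is accepted when its score is >= the threshold.
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")

    best_gap = None
    eer = None
    # Every observed score is a candidate decision threshold.
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # false accepts
        frr = sum(s < t for s in genuine) / len(genuine)     # false rejects
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


if __name__ == "__main__":
    genuine_scores = [0.9, 0.8, 0.7, 0.4]
    impostor_scores = [0.1, 0.2, 0.3, 0.6]
    print(equal_error_rate(genuine_scores, impostor_scores))
```

With a finite set of trials the two error rates rarely cross exactly, so this sketch reports the average of the two rates at the closest threshold; other interpolation conventions exist and can give slightly different values.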

Abstract

Wakeword detection plays a critical role in enabling AI assistants to listen to user voices and interact effectively. However, for languages other than English, there is a significant lack of pre-trained wakeword models. Additionally, systems that merely determine the presence of a wakeword can pose serious privacy concerns. In this paper, we propose an end-to-end approach that trains wakewords for non-English languages, particularly Korean, and uses this to develop a Voice Authentication model to protect user privacy. Our implementation employs the open-source platform OpenWakeWord, which performs wakeword detection using an FCN (Fully-Connected Network) architecture. Once a wakeword is detected, our custom-developed code calculates cosine similarity for robust user authentication. Experimental results demonstrate the effectiveness of our approach, achieving a 16.79% and a 6.6% Equal…
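The authentication step described in the abstract compares voices by cosine similarity. The following is a minimal sketch of that comparison, not the authors' code: the abstract does not say how voice embeddings are produced or which threshold is used, so the vectors, the `authenticate` helper, and the `threshold` value of 0.7 are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def authenticate(enrolled, candidate, threshold=0.7):
    """Accept the speaker if the similarity reaches the threshold.

    enrolled:  embedding stored for the registered user (hypothetical).
    candidate: embedding of the utterance that triggered the wakeword.
    threshold: assumed decision threshold, chosen here for illustration only.
    """
    return cosine_similarity(enrolled, candidate) >= threshold


if __name__ == "__main__":
    enrolled = [0.2, 0.5, 0.1]
    same_direction = [0.4, 1.0, 0.2]   # scaled copy of the enrolled vector
    different = [0.9, -0.3, 0.0]
    print(authenticate(enrolled, same_direction))  # similarity is 1.0
    print(authenticate(enrolled, different))
```

Because cosine similarity ignores vector magnitude, a louder or quieter recording that yields a scaled embedding would score the same; in practice the threshold would be tuned on held-out trials, for example at the equal error rate operating point.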

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and dialogue systems

Methods: Max Pooling · Convolution · Fully Convolutional Network