Masking Kernel for Learning Energy-Efficient Representations for Speaker Recognition and Mobile Health

Apiwat Ditthapron; Emmanuel O. Agu; Adam C. Lammert

arXiv:2302.04161 · eess.AS · August 16, 2023

TL;DR

This paper introduces a masking kernel, integrated into DNN training, that learns the speech windowing parameters (speech length and sampling rate) and reduces overall energy consumption by 57% in smartphone-based speaker recognition and health assessment tasks.

Contribution

It proposes a novel masking kernel method that learns energy-efficient speech acquisition parameters during DNN training, addressing energy use in both data collection and inference.

Findings

01

Reduces overall energy consumption by 57%.

02

Achieves competitive performance in speaker recognition and health detection.

03

Optimizes speech windowing parameters for energy efficiency.

Abstract

Modern smartphones possess hardware for audio acquisition and for speech processing tasks such as speaker recognition and health assessment. However, energy consumption remains a concern, especially for resource-intensive DNNs. Prior work has improved DNN energy efficiency by using a compact model or by reducing the dimensions of speech features. Both approaches reduce energy consumption during DNN inference but not during speech acquisition. This paper proposes a masking kernel, integrated into gradient descent during DNN training, that learns the most energy-efficient speech length and sampling rate for windowing, a common step in sample construction. To determine the energy-optimal parameters, a masking function with non-zero derivatives was combined with a low-pass filter. The proposed approach minimizes the energy consumption of both data collection and…
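
The idea of a masking function with non-zero derivatives can be illustrated with a small sketch. This is not the authors' kernel: the sigmoid shape, the function names `soft_mask` and `soft_mask_grad`, and the `sharpness` parameter are assumptions chosen only for illustration, and the low-pass filter used for the sampling rate is not shown. The sketch builds a smooth window mask whose cutoff `length` is a continuous value, so a gradient with respect to `length` exists at every sample and the cutoff could in principle be adjusted by gradient descent.

```python
import math


def _sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def soft_mask(num_samples, length, sharpness=0.1):
    """Hypothetical smooth mask: close to 1 for sample indices below
    `length`, close to 0 above it, with a gradual transition."""
    return [_sigmoid(sharpness * (length - n)) for n in range(num_samples)]


def soft_mask_grad(num_samples, length, sharpness=0.1):
    """Derivative of each mask value with respect to `length`.

    For a sigmoid s(x), ds/dx = s(x) * (1 - s(x)), so every entry is
    positive for the sizes used here, unlike a hard rectangular window
    whose derivative is zero almost everywhere."""
    grads = []
    for n in range(num_samples):
        m = _sigmoid(sharpness * (length - n))
        grads.append(sharpness * m * (1.0 - m))
    return grads


if __name__ == "__main__":
    mask = soft_mask(200, length=100.0)
    grads = soft_mask_grad(200, length=100.0)
    # Applying the mask to a signal is an element-wise product.
    signal = [1.0] * 200
    masked = [s * m for s, m in zip(signal, mask)]
    print(round(sum(masked), 2), max(grads))
```

In a training setup along these lines, the sum of the mask could serve as a rough proxy for how much audio is kept, and a penalty on it would push `length` down while the task loss pushes it up. Whether the paper formulates the trade-off this way is not stated in the abstract.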

Code & Models

Repositories

aditthapron/windowmasking (PyTorch, official)

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing