Speaker and Posture Classification using Instantaneous Intraspeech Breathing Features

Atıl İlerialkan; Alptekin Temizel; Hüseyin Hacıhabiboğlu

arXiv:2005.12230 · cs.SD · May 26, 2020

TL;DR

This paper introduces a privacy-preserving method for speaker and posture classification based on intraspeech breathing sounds. It feeds Hilbert-Huang transform (HHT) features into a CNN-GRU network and reports 87% speaker and 98% posture classification accuracy on a new dataset, BreathBase.

Contribution

It presents a novel approach using intraspeech breathing sounds and HHT features for classification, addressing privacy concerns in speech-based identification.

Findings

01

87% speaker classification accuracy

02

98% posture classification accuracy

03

Effective use of intraspeech breathing sounds for classification

Abstract

Acoustic features extracted from speech are widely used in problems such as biometric speaker identification and first-person activity detection. However, the use of speech for such purposes raises privacy issues as the content is accessible to the processing party. In this work, we propose a method for speaker and posture classification using intraspeech breathing sounds. Instantaneous magnitude features are extracted using the Hilbert-Huang transform (HHT) and fed into a CNN-GRU network for classification of recordings from the open intraspeech breathing sound dataset, BreathBase, that we collected for this study. Using intraspeech breathing sounds, 87% speaker classification accuracy and 98% posture classification accuracy were obtained.
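
As a rough illustration of what "instantaneous magnitude" means in the abstract, the sketch below computes the analytic signal of a single narrowband component with NumPy and reads off its magnitude and frequency. It is not the authors' pipeline: it skips the empirical mode decomposition step that the HHT applies before the Hilbert step, and the function names, sample rate, and toy signal are assumptions made for this example only.

```python
import numpy as np


def analytic_signal(x):
    """Return the analytic signal of a real 1-D array using the FFT.

    Negative-frequency bins are zeroed and positive-frequency bins doubled,
    which is the standard discrete construction of the Hilbert step.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0  # keep the DC bin unchanged
    if n % 2 == 0:
        weights[n // 2] = 1.0  # keep the Nyquist bin unchanged
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)


def instantaneous_features(x, sample_rate):
    """Return (magnitude per sample, frequency in Hz between samples)."""
    z = analytic_signal(x)
    magnitude = np.abs(z)
    phase = np.unwrap(np.angle(z))
    frequency = np.diff(phase) * sample_rate / (2.0 * np.pi)
    return magnitude, frequency


# Toy signal (hypothetical, not from BreathBase): a 440 Hz tone whose
# amplitude is slowly modulated at 2 Hz, sampled for exactly one second.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
envelope = 1.0 + 0.5 * np.sin(2.0 * np.pi * 2.0 * t)
tone = envelope * np.cos(2.0 * np.pi * 440.0 * t)

magnitude, frequency = instantaneous_features(tone, sample_rate)
```

For this toy signal the recovered magnitude follows the slow amplitude envelope and the instantaneous frequency stays at the 440 Hz carrier. In a full HHT, the same step would be applied to each intrinsic mode function produced by the decomposition.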

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing