Speaker and Posture Classification using Instantaneous Intraspeech Breathing Features
Atıl İlerialkan, Alptekin Temizel, Hüseyin Hacıhabiboğlu

TL;DR
This paper introduces a privacy-preserving method for speaker and posture classification based on intraspeech breathing sounds. It uses Hilbert-Huang transform (HHT) features and a CNN-GRU network, and reaches 87% speaker and 98% posture classification accuracy on the newly collected BreathBase dataset.
Contribution
The paper classifies speakers and postures from intraspeech breathing sounds using HHT features, which addresses the privacy concerns of speech-based identification because the speech content itself is not needed.
Findings
87% speaker classification accuracy
98% posture classification accuracy
Effective use of intraspeech breathing sounds for classification
Abstract
Acoustic features extracted from speech are widely used in problems such as biometric speaker identification and first-person activity detection. However, the use of speech for such purposes raises privacy issues as the content is accessible to the processing party. In this work, we propose a method for speaker and posture classification using intraspeech breathing sounds. Instantaneous magnitude features are extracted using the Hilbert-Huang transform (HHT) and fed into a CNN-GRU network for classification of recordings from the open intraspeech breathing sound dataset, BreathBase, that we collected for this study. Using intraspeech breathing sounds, 87% speaker classification accuracy and 98% posture classification accuracy were obtained.
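As a rough illustration of what "instantaneous magnitude" means, the sketch below computes the instantaneous magnitude and frequency of a single oscillatory component from its analytic signal. It covers only the Hilbert spectral step of the HHT; the empirical mode decomposition that normally precedes it is omitted. The function names, sampling rate, and synthetic test signal are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def analytic_signal(x):
    """Return the analytic signal of a real 1-D array via the FFT.

    Negative-frequency bins are zeroed and positive-frequency bins doubled,
    which is the standard discrete Hilbert-transform construction.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0      # Nyquist bin is kept once
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)


def instantaneous_features(component, fs):
    """Instantaneous magnitude and frequency (Hz) of one oscillatory component.

    In a full HHT, `component` would be one intrinsic mode function
    produced by empirical mode decomposition (not implemented here).
    """
    z = analytic_signal(component)
    magnitude = np.abs(z)
    phase = np.unwrap(np.angle(z))
    frequency = np.diff(phase) * fs / (2.0 * np.pi)
    return magnitude, frequency


# Hypothetical test signal: a 200 Hz carrier with a slow 2 Hz envelope.
fs = 8000
t = np.arange(fs) / fs
envelope = 1.0 + 0.5 * np.sin(2.0 * np.pi * 2.0 * t)
signal = envelope * np.cos(2.0 * np.pi * 200.0 * t)

mag, freq = instantaneous_features(signal, fs)
```

For this synthetic signal, `mag` tracks the slow envelope and `freq` stays at the carrier frequency. Features of this kind, computed per component, are the sort of time-varying input a CNN-GRU classifier could consume, although the paper's exact feature layout is not given here.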
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
