The Sounds of Home: A Speech-Removed Residential Audio Dataset for Sound Event Detection
Gabriel Bibbó, Thomas Deacon, Arshdeep Singh, Mark D. Plumbley

TL;DR
This paper introduces a new residential audio dataset with speech removed, designed to facilitate sound event detection research in smart homes for older adults, ensuring privacy and environmental fidelity.
Contribution
The paper presents a novel automated speech removal pipeline and a detailed residential audio dataset tailored for sound event detection in smart home environments.
Findings
The dataset accurately captures residential soundscapes with speech removed.
The speech removal pipeline effectively preserves non-speech sounds.
Analysis confirms the dataset's suitability for in-home sound event detection.
Abstract
This paper presents a residential audio dataset to support sound event detection research for smart home applications aimed at promoting wellbeing for older adults. The dataset is constructed by deploying audio recording systems in the homes of 8 participants aged 55-80 years for a 7-day period. Acoustic characteristics are documented through detailed floor plans and construction material information to enable replication of the recording environments for AI model deployment. A novel automated speech removal pipeline is developed, using pre-trained audio neural networks to detect and remove segments containing spoken voice, while preserving segments containing other sound events. The resulting dataset consists of privacy-compliant audio recordings that accurately capture the soundscapes and activities of daily living within residential spaces. The paper details the dataset creation…
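The segment-level removal step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a pre-trained audio tagger has already produced one speech probability per fixed-length segment, and the function name `remove_speech_segments`, the 1-second segment length, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def remove_speech_segments(audio, sample_rate, speech_probs,
                           segment_seconds=1.0, threshold=0.5):
    """Drop fixed-length segments flagged as speech; keep the rest.

    speech_probs is assumed to hold one speech probability per segment,
    e.g. from a pre-trained audio tagging network (not included here).
    segment_seconds and threshold are illustrative, not from the paper.
    """
    seg_len = int(round(segment_seconds * sample_rate))
    kept = []
    for i, prob in enumerate(speech_probs):
        start = i * seg_len
        segment = audio[start:start + seg_len]
        if segment.size == 0:
            break  # more probabilities than audio; stop early
        if prob < threshold:
            kept.append(segment)  # non-speech segment is preserved
    if not kept:
        return np.zeros(0, dtype=audio.dtype)
    return np.concatenate(kept)


# Toy example: 4 seconds of "audio" at 2 samples per second.
audio = np.arange(8, dtype=np.float32)
probs = [0.1, 0.9, 0.2, 0.7]  # hypothetical per-second speech probabilities
cleaned = remove_speech_segments(audio, sample_rate=2, speech_probs=probs)
```

In this toy run the second and fourth segments exceed the threshold and are discarded, so only the samples from the first and third seconds remain. A lower threshold removes more audio and favours privacy; a higher one preserves more non-speech events at the risk of leaving speech in.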
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing
