Speech-dependent Data Augmentation for Own Voice Reconstruction with Hearable Microphones in Noisy Environments

Mattes Ohlenbusch; Christian Rollwage; Simon Doclo

arXiv:2405.11592·eess.AS·August 20, 2025·EURASIP J. Audio Speech Music. Process.

TL;DR

This paper introduces speech-dependent data augmentation for training own voice reconstruction systems for hearables in noisy environments. Additional own voice signals are simulated from available speech data using own voice transfer characteristics, which improves reconstruction performance.

Contribution

The paper presents speech-dependent augmentation techniques that estimate transfer functions between the hearable microphones from a limited amount of recorded data, then use them to generate more training samples for own voice reconstruction.
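To illustrate what estimating a transfer function between two microphones from paired recordings can look like, here is a minimal sketch. It is not the authors' estimator: it assumes a single time-invariant filter, equal-length recordings, and a plain least-squares estimate per frequency bin. The names `stft_frames` and `estimate_rtf`, the frame length, and the hop size are all illustrative choices.

```python
import numpy as np

def stft_frames(x, frame_len=256, hop=128):
    """Hann-windowed FFT frames of a 1-D signal (assumes len(x) >= frame_len)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)  # shape: (n_frames, frame_len // 2 + 1)

def estimate_rtf(outer, in_ear, frame_len=256, hop=128, eps=1e-12):
    """Least-squares estimate of H(f) such that in_ear ~ H * outer per bin.

    Assumption: both signals have the same length and are time-aligned.
    """
    X = stft_frames(outer, frame_len, hop)
    Y = stft_frames(in_ear, frame_len, hop)
    cross = np.sum(np.conj(X) * Y, axis=0)   # cross-spectrum, summed over frames
    auto = np.sum(np.abs(X) ** 2, axis=0)    # auto-spectrum of the outer signal
    return cross / (auto + eps)
```

The multiplicative per-bin model only holds approximately, and works best when the underlying filter is short compared to the frame length.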

Findings

01. Speech-dependent augmentation outperforms other methods.

02. Fine-tuning further enhances reconstruction quality.

03. Transfer characteristics enable realistic voice simulation.

Abstract

Own voice pickup for hearables in noisy environments benefits from using both an outer and an in-ear microphone outside and inside the occluded ear. Due to environmental noise recorded at both microphones, and amplification of the own voice at low frequencies and band-limitation at the in-ear microphone, an own voice reconstruction system is needed to enable communication. A large amount of own voice signals is required to train a supervised deep learning-based own voice reconstruction system. Training data can either be obtained by recording a large amount of own voice signals of different talkers with a specific device, which is costly, or through augmentation of available speech data. Own voice signals can be simulated by assuming a linear time-invariant relative transfer function between hearable microphones for each phoneme, referred to as own voice transfer characteristics. In…
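The idea of a linear time-invariant transfer function per phoneme can be sketched as follows. This is an illustrative simplification, not the procedure from the paper: it assumes that phoneme boundaries are already known as sample indices and that each phoneme has a short FIR filter. The function `simulate_in_ear`, the segment format, and the filter dictionary are hypothetical.

```python
import numpy as np

def simulate_in_ear(outer, segments, phoneme_filters):
    """Simulate an in-ear signal by filtering each phoneme segment separately.

    outer           : 1-D array, speech at the outer microphone
    segments        : list of (start, end, phoneme) with sample indices
    phoneme_filters : dict mapping phoneme label -> FIR taps (hypothetical values)
    """
    max_len = max(len(h) for h in phoneme_filters.values())
    out = np.zeros(len(outer) + max_len - 1)
    for start, end, phoneme in segments:
        h = phoneme_filters[phoneme]
        seg = np.convolve(outer[start:end], h)  # filter tail extends past `end`
        out[start : start + len(seg)] += seg    # overlap-add into the output
    return out[: len(outer)]

# Usage with made-up filters and boundaries:
rng = np.random.default_rng(0)
speech = rng.standard_normal(1000)
filters = {"a": np.array([1.0, 0.5]), "s": np.array([0.2])}
segments = [(0, 300, "a"), (300, 700, "s"), (700, 1000, "a")]
in_ear_sim = simulate_in_ear(speech, segments, filters)
```

Letting each segment's filter tail overlap into the next segment avoids hard discontinuities at phoneme boundaries. If every phoneme shares the same filter, the result reduces to ordinary convolution of the whole signal.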
