Low-Complexity Own Voice Reconstruction for Hearables with an In-Ear Microphone

Mattes Ohlenbusch; Christian Rollwage; Simon Doclo

arXiv:2409.04136 · eess.AS · August 20, 2025 · ICASSP

Open Access

TL;DR

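This paper proposes low-complexity variants of a deep learning-based own voice reconstruction (OVR) system for hearables, which reconstructs the user's own voice recorded in noisy environments, and investigates how many device-specific recordings are needed for data augmentation and fine-tuning.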

Contribution

The paper proposes low-complexity variants of an existing deep learning-based OVR system, based on the FT-JNF architecture, for hearables with limited computational resources.

Findings

01. Significant speech quality improvement demonstrated in simulations

02. Effective data augmentation reduces the need for extensive device-specific recordings

03. Low-complexity models perform well under resource constraints

Abstract

Hearable devices, equipped with one or more microphones, are commonly used for speech communication. Here, we consider the scenario where a hearable is used to capture the user's own voice in a noisy environment. In this scenario, own voice reconstruction (OVR) is essential for enhancing the quality and intelligibility of the recorded noisy own voice signals. In previous work, we developed a deep learning-based OVR system, aiming to reduce the amount of device-specific recordings for training by using data augmentation with phoneme-dependent models of own voice transfer characteristics. Given the limited computational resources available on hearables, in this paper we propose low-complexity variants of an OVR system based on the FT-JNF architecture and investigate the required amount of device-specific recordings for effective data augmentation and fine-tuning. Simulation results show…

Taxonomy

Topics: Speech and Audio Processing