Direction of Arrival Correction through Speech Quality Feedback
Caleb Rascon

TL;DR
This paper introduces a real-time DOA correction method for speech enhancement that uses speech quality feedback to improve direction estimation accuracy, addressing errors in initial DOA estimates.
Contribution
It proposes a novel real-time DOA correction scheme utilizing speech quality feedback and an Adam-based optimization loop, improving accuracy despite high variability in quality estimates.
Findings
Corrects DOA errors of up to 15° in real time
Uses speech quality as a feedback variable
Provides insights for faster convergence and reduced variability
Abstract
Real-time speech enhancement has begun to rise in performance, and the Demucs Denoiser model has recently demonstrated strong performance in multiple-speech-source scenarios when accompanied by a location-based speech target selection strategy. However, it has been shown to be sensitive to errors in the direction-of-arrival (DOA) estimation. In this work, a DOA correction scheme is proposed that uses the real-time estimated speech quality of its enhanced output as the observed variable in an Adam-based optimization feedback loop to find the correct DOA. In spite of the high variability of the speech quality estimation, the proposed system is able to correct, in real time, an error of up to 15° using only the speech quality as its guide. Several insights are provided for future versions of the proposed system to speed up convergence and further reduce the speech quality estimation…
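The feedback loop described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the speech enhancer and the quality estimator are replaced by a hypothetical synthetic function (`simulated_quality`) whose noisy output peaks at the true DOA, and the gradient is estimated by probing the quality on either side of the current DOA, which is an assumption made here for illustration. Only the use of Adam with speech quality as the observed variable comes from the abstract; all names, step sizes, and noise levels are placeholders.

```python
import math
import random


def simulated_quality(doa_deg, true_doa_deg, rng, noise_std=0.02):
    """Hypothetical stand-in for a real-time speech quality estimate.

    Quality is highest when the steering DOA matches the true DOA,
    and every observation is corrupted by Gaussian noise.
    """
    width_deg = 20.0  # assumed width of the quality peak
    clean = math.exp(-((doa_deg - true_doa_deg) ** 2) / (2 * width_deg ** 2))
    return clean + rng.gauss(0.0, noise_std)


def correct_doa(initial_doa_deg, quality_fn, steps=300, lr=0.5,
                probe_deg=5.0, averages=4,
                beta1=0.9, beta2=0.999, eps=1e-8):
    """Adjust a DOA estimate with Adam so that observed quality increases."""
    doa = initial_doa_deg
    m = 0.0  # first-moment (mean) estimate of the gradient
    v = 0.0  # second-moment (uncentered variance) estimate
    for t in range(1, steps + 1):
        # Average several noisy observations on each side of the current DOA.
        q_plus = sum(quality_fn(doa + probe_deg) for _ in range(averages)) / averages
        q_minus = sum(quality_fn(doa - probe_deg) for _ in range(averages)) / averages
        # Central finite difference; negated because Adam minimizes
        # and the goal is to maximize quality.
        grad = -(q_plus - q_minus) / (2 * probe_deg)
        # Standard Adam update with bias correction.
        m = beta1 * m + (1 - beta1) * grad
        v = beta2 * v + (1 - beta2) * grad * grad
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        doa -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return doa


# Usage: start 15 degrees away from an assumed true DOA of 30 degrees.
rng = random.Random(0)
true_doa = 30.0
estimate = correct_doa(45.0, lambda d: simulated_quality(d, true_doa, rng))
print(f"corrected DOA: {estimate:.2f} degrees")
```

Averaging several observations per probe is one simple way to cope with a highly variable quality estimate, at the cost of slower updates; in a real system each observation would require enhancing and scoring a window of audio.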
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis
