Adaptive dictionary based approach for background noise and speaker classification and subsequent source separation
K V Vijay Girish, A G Ramakrishnan, T V Ananthapadmanabha

TL;DR
This paper presents an adaptive dictionary-based hierarchical approach for classifying background noise and speakers in noisy conversations, improving recognition and separation performance especially at low SNRs.
Contribution
It introduces a novel hierarchical method combining dictionary learning, block sparsity, and source recovery, with adaptive dictionary updates for unknown sources.
Findings
Speaker recognition rate improves by around 15% at 0 dB SNR.
Signal-to-distortion ratio improves by up to 10% at 0 dB SNR.
Adaptive dictionary learning performs well even when speakers or noises are outside trained sets.
Abstract
A judicious combination of dictionary learning methods, block sparsity and a source recovery algorithm is used in a hierarchical manner to identify the noises and the speakers from a noisy conversation between two people. Conversations are simulated using speech from two speakers, each with a different background noise, with varied SNR values, down to -10 dB. Ten each of randomly chosen male and female speakers from the TIMIT database and all the noise sources from the NOISEX database are used for the simulations. For speaker identification, the relative value of weights recovered is used to select an appropriately small subset of the test data, assumed to contain speech. This novel choice of using varied amounts of test data results in an improvement in the speaker recognition rate of around 15% at SNR of 0 dB. Speech and noise are separated using dictionaries of the estimated speaker…
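The general idea of classifying a source from the weights recovered over a concatenated dictionary can be sketched as follows. This is a minimal toy illustration, not the paper's algorithm: the orthonormal random dictionaries, the greedy orthogonal matching pursuit used for recovery, and the names `omp`, `block_energies` and `classify_source` are all assumptions made for this example. Dictionaries learned from real speech or noise would be correlated and typically overcomplete.

```python
import numpy as np

def omp(D, y, n_nonzero):
    """Greedy sparse recovery (orthogonal matching pursuit).

    Picks one atom per iteration and refits all chosen atoms by least squares.
    """
    support, residual = [], y.copy()
    sol = np.zeros(0)
    for _ in range(n_nonzero):
        corr = np.abs(D.T @ residual)
        if support:
            corr[support] = 0.0  # never pick the same atom twice
        support.append(int(np.argmax(corr)))
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
    weights = np.zeros(D.shape[1])
    weights[support] = sol
    return weights

def block_energies(weights, block_sizes):
    """Sum of absolute weights inside each source's block of atoms."""
    edges = np.cumsum([0] + list(block_sizes))
    return np.array([np.abs(weights[a:b]).sum()
                     for a, b in zip(edges[:-1], edges[1:])])

def classify_source(dictionaries, y, n_nonzero=3):
    """Return the index of the source whose block carries the most weight."""
    D = np.hstack(dictionaries)  # concatenate per-source dictionaries
    weights = omp(D, y, n_nonzero)
    energies = block_energies(weights, [d.shape[1] for d in dictionaries])
    return int(np.argmax(energies)), energies

# Hypothetical toy data: 3 sources, 15 orthonormal atoms each, 128-dim features.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((128, 45)))
dictionaries = [Q[:, i * 15:(i + 1) * 15] for i in range(3)]

true_source = 1
true_weights = np.zeros(15)
true_weights[[2, 7, 11]] = [1.0, 0.8, 1.2]
y = dictionaries[true_source] @ true_weights + 0.01 * rng.standard_normal(128)

predicted, energies = classify_source(dictionaries, y)
print(predicted, energies)
```

Because the test vector is built from atoms of a single source, nearly all of the recovered weight falls in that source's block, and the block with the largest summed weight gives the predicted class.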
Taxonomy
Topics: Speech and Audio Processing · Blind Source Separation Techniques · Music and Audio Processing
