Maximum Phase Modeling for Sparse Linear Prediction of Speech
Thomas Drugman

TL;DR
This paper introduces a maximum-phase modeling technique for sparse linear prediction (LP) of speech. It removes the minimum-phase assumption made by existing sparse LP methods, which yields a sparser LP residual and proves effective in speech polarity detection and excitation modeling.
Contribution
The paper proposes a way to model the maximum-phase contribution of speech within the sparse LP framework. The approach does not rely on the minimum-phase assumption and can be applied to any filter representation.
Findings
Increases sparsity of LP residual signals
Improves speech polarity detection accuracy
Enhances excitation modeling effectiveness
Abstract
Linear prediction (LP) is a ubiquitous analysis method in speech processing. Various studies have focused on sparse LP algorithms by introducing sparsity constraints into the LP framework. Sparse LP has been shown to be effective for several problems in speech modeling and coding. However, all existing approaches assume the speech signal to be minimum-phase. Because speech is known to be mixed-phase, the resulting residual signal contains a persistent maximum-phase component. The aim of this paper is to propose a novel technique which incorporates a modeling of the maximum-phase contribution of speech, and can be applied to any filter representation. The proposed method is shown to significantly increase the sparsity of the LP residual signal and to be effective in two illustrative applications: speech polarity detection and excitation modeling.
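The abstract's central observation, that causal (minimum-phase) LP leaves a maximum-phase component in the residual, can be illustrated with a small synthetic sketch. This is not the paper's algorithm: it uses plain least-squares LP (autocorrelation method) rather than a sparse LP, and the test signal is assumed to be a purely maximum-phase resonance, built by time-reversing the impulse response of a two-pole resonator. All function names and parameter values below are hypothetical and chosen only for illustration. Sparsity is measured with the L1/L2 norm ratio, which is close to 1 for a single impulse and grows as energy spreads over more samples.

```python
import numpy as np

def lp_coefficients(x, order):
    """Autocorrelation-method LP. Returns [1, a1, ..., ap] of the inverse filter A(z)."""
    n = len(x)
    r = np.array([np.dot(x[:n - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R w = r[1:], solved directly (fine for small orders).
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    w = np.linalg.solve(R, r[1:])
    return np.concatenate(([1.0], -w))

def causal_residual(x, a):
    """Conventional inverse filtering: e[n] = sum_k a[k] * x[n - k]."""
    return np.convolve(x, a)[:len(x)]

def anticausal_residual(x, a):
    """Same filter run backwards in time: e[n] = sum_k a[k] * x[n + k]."""
    return causal_residual(x[::-1], a)[::-1]

def l1_l2_ratio(e):
    """Sparsity proxy: about 1 for a single impulse, larger for spread-out signals."""
    return np.sum(np.abs(e)) / np.sqrt(np.sum(e ** 2))

def resonator_impulse_response(radius, angle, length):
    """Impulse response of the stable all-pole filter with poles radius * exp(+/- j*angle)."""
    h = np.zeros(length)
    for n in range(length):
        h[n] = 1.0 if n == 0 else 0.0
        if n >= 1:
            h[n] += 2.0 * radius * np.cos(angle) * h[n - 1]
        if n >= 2:
            h[n] -= radius ** 2 * h[n - 2]
    return h

def demo():
    # Assumed toy signal: time-reversing a minimum-phase response makes it maximum-phase.
    h = resonator_impulse_response(radius=0.95, angle=0.3 * np.pi, length=400)
    x = h[::-1]
    # Time reversal leaves the autocorrelation unchanged, so LP recovers the same A(z).
    a = lp_coefficients(x, order=2)
    return l1_l2_ratio(causal_residual(x, a)), l1_l2_ratio(anticausal_residual(x, a))

if __name__ == "__main__":
    causal, anticausal = demo()
    print("L1/L2 ratio, causal inverse filter:     ", causal)
    print("L1/L2 ratio, anticausal inverse filter: ", anticausal)
```

In this toy setting the causal inverse filter cannot cancel the maximum-phase resonance, so its residual stays spread out, while running the same filter anticausally collapses the signal to nearly a single impulse. Real speech is mixed-phase, so neither filter alone would be expected to suffice, which is the gap the paper targets.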
Taxonomy
Topics: Speech and Audio Processing · Advanced Adaptive Filtering Techniques · Blind Source Separation Techniques
