Phase-based Information for Voice Pathology Detection
Thomas Drugman, Thomas Dubuisson, Thierry Dutoit

TL;DR
This paper explores the use of phase-based features, particularly group delay functions, for voice disorder detection, demonstrating their high discriminative power and complementarity to magnitude spectrum features.
Contribution
It introduces phase-based features for voice pathology detection and compares their effectiveness to traditional magnitude spectrum features, highlighting their complementary nature.
Findings
Phase-based features effectively characterize phonation irregularities.
They provide high discrimination performance in voice disorder detection.
Phase features complement magnitude spectrum features.
Abstract
In most current approaches to speech processing, information is extracted from the magnitude spectrum. However, recent perceptual studies have underlined the importance of the phase component. The goal of this paper is to investigate the potential of phase-based features for automatically detecting voice disorders. It is shown that group delay functions are appropriate for characterizing irregularities in phonation. In addition, the extent to which the mixed-phase model of speech is respected is discussed. The proposed phase-based features are evaluated and compared to other parameters derived from the magnitude spectrum. The two feature streams are shown to be complementary. Furthermore, phase-based features turn out to convey a great amount of relevant information, leading to high discrimination performance.
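For readers unfamiliar with group delay functions, the sketch below shows one common way to compute the group delay of a single frame: the negative derivative of the phase spectrum, obtained without phase unwrapping from the Fourier transforms of x[n] and n·x[n]. This is a generic textbook formulation, not necessarily the exact feature extraction used by the authors. The function name `group_delay`, the FFT size, and the `eps` floor are illustrative assumptions.

```python
import numpy as np

def group_delay(frame, n_fft=512, eps=1e-10):
    """Group delay (in samples) of one frame at each rfft bin.

    Uses the identity tau(w) = (X_R*Y_R + X_I*Y_I) / |X|^2,
    where X = FT{x[n]} and Y = FT{n * x[n]}.
    Hypothetical helper; parameters are illustrative assumptions.
    """
    x = np.asarray(frame, dtype=float)
    n = np.arange(len(x))
    X = np.fft.rfft(x, n_fft)        # spectrum of the frame
    Y = np.fft.rfft(n * x, n_fft)    # spectrum of the time-weighted frame
    power = X.real**2 + X.imag**2
    # Floor the denominator to avoid division by zero at spectral nulls.
    return (X.real * Y.real + X.imag * Y.imag) / np.maximum(power, eps)

# Sanity check: an impulse delayed by 5 samples has a flat group delay of 5.
impulse = np.zeros(32)
impulse[5] = 1.0
gd = group_delay(impulse)
```

For a pure delayed impulse the result is constant across frequency and equal to the delay. On real speech frames the function becomes spiky near spectral zeros, which is why the denominator is floored here.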
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing
