Statistical Speech Model Description with VMF Mixture Model
Zhanyu Ma, Arne Leijon

TL;DR
This paper represents LSF parameters as unit vectors and models them with a von Mises-Fisher mixture model (VMM). It derives an optimal inter-component bit allocation strategy and a distortion-rate relation, and the resulting VMM-based VQ (VVQ) outperforms DVQ and GVQ in the reported experiments.
Contribution
It proposes a VMM-based approach to modeling speech parameters and derives a distortion-rate (D-R) relation for the corresponding VQ, with the aim of improving speech coding efficiency.
Findings
VVQ outperforms DVQ and GVQ in experiments
Optimal inter-component bit allocation enhances coding efficiency
Derived D-R relation for VMM-based VQ
Abstract
In this paper, we represent the LSF parameters in a unit vector form, which has directional characteristics. The underlying distribution of this unit vector variable is modeled by a von Mises-Fisher mixture model (VMM). With the high rate theory, the optimal inter-component bit allocation strategy is proposed and the distortion-rate (D-R) relation is derived for the VMM-based VQ (VVQ). Experimental results show that the VVQ outperforms our recently introduced DVQ and the conventional GVQ.
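To make the modeling step concrete, the following is a minimal sketch of how the log-density of a von Mises-Fisher mixture could be evaluated for a unit vector. It uses the standard vMF density, f(x; mu, kappa) = C_d(kappa) exp(kappa mu^T x) with C_d(kappa) = kappa^(d/2-1) / ((2 pi)^(d/2) I_(d/2-1)(kappa)). The function names, the series-based Bessel evaluation, and the fixed number of series terms are assumptions made for illustration; they are not taken from the paper, and the mapping from LSF parameters to unit vectors is not covered here.

```python
import math

def log_bessel_i(order, x, terms=200):
    """log of the modified Bessel function I_order(x), x > 0.

    Sums the power series in the log domain. The fixed number of terms
    is an illustrative assumption and suits moderate x only.
    """
    logs = [(2 * m + order) * math.log(x / 2.0)
            - math.lgamma(m + 1) - math.lgamma(m + order + 1)
            for m in range(terms)]
    peak = max(logs)
    return peak + math.log(sum(math.exp(v - peak) for v in logs))

def vmf_log_pdf(x, mu, kappa):
    """Log-density of a vMF distribution at unit vector x.

    mu is the unit-norm mean direction, kappa > 0 the concentration.
    """
    d = len(x)
    order = d / 2.0 - 1.0
    log_norm = (order * math.log(kappa)
                - (d / 2.0) * math.log(2.0 * math.pi)
                - log_bessel_i(order, kappa))
    return log_norm + kappa * sum(a * b for a, b in zip(x, mu))

def vmm_log_pdf(x, weights, mus, kappas):
    """Log-density of a vMF mixture (weights sum to 1), via log-sum-exp."""
    logs = [math.log(w) + vmf_log_pdf(x, m, k)
            for w, m, k in zip(weights, mus, kappas)]
    peak = max(logs)
    return peak + math.log(sum(math.exp(v - peak) for v in logs))
```

A call such as `vmm_log_pdf([0.0, 0.0, 1.0], [0.5, 0.5], [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]], [5.0, 2.0])` evaluates a two-component mixture on the sphere in three dimensions. For large concentrations, a numerically scaled Bessel routine from a scientific library would be a more robust choice than the truncated series used here.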
Taxonomy
Topics: Bayesian Methods and Mixture Models · Speech and Audio Processing · Advanced Data Compression Techniques
