Statistical Speech Model Description with VMF Mixture Model

Zhanyu Ma; Arne Leijon

arXiv:1808.00960 · cs.SD · February 4, 2020 · 1 cites


Open Access

TL;DR

This paper represents LSF parameters as unit vectors, models them with a von Mises-Fisher mixture model (VMM), derives an optimal inter-component bit allocation strategy from high-rate theory, and reports that the resulting VQ outperforms DVQ and conventional GVQ.

Contribution

It proposes a VMM-based approach to modeling LSF speech parameters and derives a distortion-rate (D-R) relation for the resulting VMM-based VQ (VVQ), which improves speech coding efficiency relative to DVQ and GVQ in the reported experiments.

Findings

01 VVQ outperforms DVQ and GVQ in experiments

02 Optimal inter-component bit allocation enhances coding efficiency

03 Derived D-R relation for VMM-based VQ

Abstract

In this paper, we represent the line spectral frequency (LSF) parameters in a unit-vector form, which has directional characteristics. The underlying distribution of this unit-vector variable is modeled by a von Mises-Fisher mixture model (VMM). Using high-rate theory, we propose an optimal inter-component bit allocation strategy and derive the distortion-rate (D-R) relation for the VMM-based VQ (VVQ). Experimental results show that the VVQ outperforms our recently introduced DVQ and the conventional GVQ.
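To make the modeling idea in the abstract concrete, the following is a minimal sketch of evaluating a von Mises-Fisher mixture log-density on a unit vector built from LSFs. The abstract does not state how the unit vector is constructed; the square-root-of-normalized-gaps mapping in `lsf_to_unit_vector` is an assumption made here for illustration, and all function names are hypothetical. The vMF density itself is the standard one, with normalizer expressed through the modified Bessel function of the first kind.

```python
import math


def lsf_to_unit_vector(lsf):
    """Map K ordered LSFs in (0, pi) to a (K+1)-dimensional unit vector.

    ASSUMPTION (not given in the abstract): take the K+1 gaps between
    0, the LSFs, and pi, scale them to sum to 1, then take square roots,
    so that the squared entries sum to 1.
    """
    edges = [0.0] + list(lsf) + [math.pi]
    gaps = [(b - a) / math.pi for a, b in zip(edges, edges[1:])]
    if any(g <= 0 for g in gaps):
        raise ValueError("LSFs must be strictly increasing inside (0, pi)")
    return [math.sqrt(g) for g in gaps]


def log_bessel_i(order, x, terms=200):
    """log I_order(x) for x > 0, from the power series summed in the log domain.

    200 terms are ample for moderate x; a very large x would need more.
    """
    logs = [
        (2 * m + order) * math.log(x / 2)
        - math.lgamma(m + 1)
        - math.lgamma(m + order + 1)
        for m in range(terms)
    ]
    peak = max(logs)
    return peak + math.log(sum(math.exp(v - peak) for v in logs))


def vmf_log_pdf(x, mu, kappa):
    """Log-density of a vMF distribution with mean direction mu, concentration kappa > 0."""
    d = len(x)
    order = d / 2 - 1
    log_norm = (
        order * math.log(kappa)
        - (d / 2) * math.log(2 * math.pi)
        - log_bessel_i(order, kappa)
    )
    return log_norm + kappa * sum(a * b for a, b in zip(x, mu))


def vmm_log_pdf(x, weights, mus, kappas):
    """Log-density of a vMF mixture (log-sum-exp over the weighted components)."""
    logs = [
        math.log(w) + vmf_log_pdf(x, m, k)
        for w, m, k in zip(weights, mus, kappas)
    ]
    peak = max(logs)
    return peak + math.log(sum(math.exp(v - peak) for v in logs))


# Illustrative values only; these are not parameters from the paper.
x = lsf_to_unit_vector([0.5, 1.0, 2.0])
mus = [x, [0.5, 0.5, 0.5, 0.5]]
print(vmm_log_pdf(x, weights=[0.6, 0.4], mus=mus, kappas=[20.0, 5.0]))
```

Working in the log domain keeps the Bessel normalizer and the mixture sum numerically stable when the concentration parameters are large, which is the usual regime for tightly clustered directional data.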


Taxonomy

Topics: Bayesian Methods and Mixture Models · Speech and Audio Processing · Advanced Data Compression Techniques