Dirichlet Mixture Model based VQ Performance Prediction for Line Spectral Frequency

Zhanyu Ma

arXiv:1808.00818·cs.LG·February 4, 2020

TL;DR

This paper models the distribution of line spectral frequency parameters using a Dirichlet mixture model to predict vector quantization performance and estimate the bit rate needed for transparent speech coding.

Contribution

It introduces a DMM-based approach to derive performance bounds and links spectral distortion to MSE for improved LSF vector quantization analysis.

Findings

01

Derived performance bounds for LSF VQ using DMM.

02

Established a polynomial mapping between LSD and MSE.

03

Estimated minimum bit rate for transparent coding.

Abstract

In this paper, we continue our previous work on Dirichlet mixture model (DMM)-based VQ to derive the performance bound of LSF VQ. The LSF parameters are transformed into the $\Delta$LSF domain, and the underlying distribution of the $\Delta$LSF parameters is modelled by a DMM with a finite number of mixture components. The quantization distortion, in terms of the mean squared error (MSE), is calculated using high-rate theory. The mapping between the perceptually motivated log spectral distortion (LSD) and the MSE is empirically approximated by a polynomial. With this mapping function, the minimum bit rate required for transparent coding of the LSF is estimated.
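The abstract does not spell out the $\Delta$LSF transform, so the following is only a minimal sketch under stated assumptions: the LSFs are strictly increasing and lie in $(0, \pi)$, the differences between neighbouring LSFs are normalized by $\pi$, and a final element is appended so that the vector sums to one and could be treated as Dirichlet-style data. The function names `lsf_to_delta_lsf` and `delta_lsf_to_lsf` and the example values are hypothetical, chosen for illustration.

```python
import numpy as np


def lsf_to_delta_lsf(lsf):
    """Map ordered LSFs in (0, pi) to a vector of positive gaps that sums to 1.

    Assumption (not stated in the abstract): gaps are normalized by pi and the
    remaining gap up to pi is appended as the last element.
    """
    lsf = np.asarray(lsf, dtype=float)
    if np.any(lsf <= 0) or np.any(lsf >= np.pi) or np.any(np.diff(lsf) <= 0):
        raise ValueError("LSFs must be strictly increasing and lie in (0, pi)")
    # Treat 0 and pi as outer boundaries and take neighbouring differences.
    edges = np.concatenate(([0.0], lsf, [np.pi]))
    return np.diff(edges) / np.pi


def delta_lsf_to_lsf(delta):
    """Invert the transform: cumulative sum of all gaps except the last one."""
    delta = np.asarray(delta, dtype=float)
    return np.pi * np.cumsum(delta[:-1])


# Hypothetical 5-dimensional LSF vector in radians, for illustration only.
lsf = np.array([0.25, 0.60, 1.10, 1.90, 2.70])
delta = lsf_to_delta_lsf(lsf)
recovered = delta_lsf_to_lsf(delta)
```

Under these assumptions, every element of `delta` is positive and the elements sum to one, which is the kind of support a Dirichlet density is defined on. The ordering constraint on the LSFs becomes a simple positivity constraint in the transformed domain.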

Taxonomy

Topics: Advanced Data Compression Techniques · Image and Video Quality Assessment · Video Coding and Compression Technologies