Discrete representations in neural models of spoken language

Bertrand Higy, Lieke Gelderloos, Afra Alishahi, Grzegorz Chrupała

arXiv:2105.05582·cs.CL·September 17, 2021

TL;DR

This paper compares metrics for evaluating discrete neural representations in models of spoken language. It finds that the metrics can give inconsistent results, and it examines how well the learned discrete units correlate with linguistic units.

Contribution

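The paper systematically compares four commonly used metrics for analyzing vector-quantized models of spoken language, applying them to two models while varying the placement and size of the discretization layer, and discusses the metrics' merits and limitations.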

Findings

01 Different metrics yield inconsistent evaluation results.

02 Minimal-pair stimuli disadvantage larger discrete inventories.

03 Vector-quantized representations correlate moderately with linguistic units.

Abstract

The distributed and continuous representations used by neural networks are at odds with representations employed in linguistics, which are typically symbolic. Vector quantization has been proposed as a way to induce discrete neural representations that are closer in nature to their linguistic counterparts. However, it is not clear which metrics are best suited to analyze such discrete representations. We compare the merits of four commonly used metrics in the context of weakly supervised models of spoken language. We compare the results they show when applied to two different models, while systematically studying the effect of the placement and size of the discretization layer. We find that different evaluation regimes can give inconsistent results. While we can attribute them to the properties of the different metrics in most cases, one point of concern remains: the use of minimal…
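As a rough illustration of what vector quantization does to continuous activations, the sketch below maps each frame vector to the index of its nearest codebook entry. It is not the paper's implementation: the function name `quantize`, the toy codebook, the example frames, and the choice of squared Euclidean distance are all assumptions made for this example.

```python
from typing import List, Sequence


def quantize(frames: Sequence[Sequence[float]],
             codebook: Sequence[Sequence[float]]) -> List[int]:
    """Return, for each frame, the index of the nearest codebook entry.

    Distance is squared Euclidean (an assumption for this sketch).
    """
    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    codes = []
    for frame in frames:
        distances = [sq_dist(frame, entry) for entry in codebook]
        # Ties resolve to the lowest index, since index() returns the first match.
        codes.append(distances.index(min(distances)))
    return codes


# Hypothetical toy data: a codebook of size 3 and four 2-dimensional frames.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
frames = [[0.1, -0.1], [0.9, 1.2], [0.2, 1.8], [0.8, 0.9]]

codes = quantize(frames, codebook)
print(codes)  # one discrete code index per frame
```

The resulting sequence of integer codes is the kind of discrete representation that can then be compared against symbolic linguistic units. The codebook size here plays the role of the size of the discretization layer mentioned in the abstract.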

Code & Models

Repositories

bhigy/discrete-repr
PyTorch · Official

Taxonomy

Topics: Neural Networks and Applications · Speech Recognition and Synthesis · Topic Modeling