On the difficulty of a distributional semantics of spoken language
Grzegorz Chrupała, Lieke Gelderloos, Ákos Kádár, Afra Alishahi

TL;DR
This paper explores the challenges of developing distributional semantic models for spoken language and examines whether abstracting away from surface variability in speech makes it feasible to learn meaningful representations.
Contribution
It investigates the adaptation of unsupervised semantic learning methods from written to spoken language and evaluates simple models on synthetic and human speech datasets.
Findings
Models learn some semantic representations from synthetic speech
Results on human speech are inconclusive
Discusses inherent challenges in natural spoken language semantics
Abstract
In the domain of unsupervised learning, most work on speech has focused on discovering low-level constructs such as phoneme inventories or word-like units. In contrast, for written language there is a large body of work on unsupervised induction of semantic representations of words, whole sentences and longer texts. In this study we examine the challenges of adapting these approaches from written to spoken language. We conjecture that unsupervised learning of the semantics of spoken language becomes feasible if we abstract from the surface variability. We simulate this setting with a dataset of utterances spoken by a realistic but uniform synthetic voice. We evaluate two simple unsupervised models which, to varying degrees of success, learn semantic representations of speech fragments. Finally, we present inconclusive results on human speech, and discuss the challenges inherent in…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
