PWESuite: Phonetic Word Embeddings and Tasks They Facilitate
Vilém Zouhar, Kalvin Chang, Chenxuan Cui, Nathaniel Carlson, Nathaniel Robinson, Mrinmaya Sachan, David Mortensen

TL;DR
This paper introduces three phonetic word embedding methods based on articulatory features and provides a comprehensive task suite for fair evaluation, aiming to advance phonetic information integration in NLP applications.
Contribution
It develops novel phonetic embedding methods and establishes a standardized evaluation framework for phonetic word embeddings in NLP.
Findings
Phonetic embeddings improve rhyme and sound similarity tasks.
The evaluation suite enables fair comparison of phonetic embedding methods.
Results show benefits of phonetic information in specific NLP tasks.
Abstract
Mapping words into a fixed-dimensional vector space is the backbone of modern NLP. While most word embedding methods successfully encode semantic information, they overlook phonetic information that is crucial for many tasks. We develop three methods that use articulatory features to build phonetically informed word embeddings. To address the inconsistent evaluation of existing phonetic word embedding methods, we also contribute a task suite to fairly evaluate past, current, and future methods. We evaluate both (1) intrinsic aspects of phonetic word embeddings, such as word retrieval and correlation with sound similarity, and (2) extrinsic performance on tasks such as rhyme and cognate detection and sound analogies. We hope our task suite will promote reproducibility and inspire future phonetic embedding research.
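To make the intrinsic "correlation with sound similarity" evaluation concrete, here is a minimal sketch of one way such a check could be set up. It is not the authors' implementation: the four-feature phoneme inventory, the feature-weighted edit distance, and the toy embeddings are all hypothetical stand-ins chosen for illustration. The idea is to compare distances between word embeddings against a reference distance derived from articulatory features, and report how well the two agree.

```python
import math

# Hypothetical binary articulatory features per phoneme:
# (voiced, nasal, labial, vowel). A real inventory would be much richer.
FEATURES = {
    "p": (0, 0, 1, 0),
    "b": (1, 0, 1, 0),
    "m": (1, 1, 1, 0),
    "t": (0, 0, 0, 0),
    "a": (1, 0, 0, 1),
}


def phoneme_distance(p, q):
    """Fraction of articulatory features on which two phonemes differ."""
    fp, fq = FEATURES[p], FEATURES[q]
    return sum(a != b for a, b in zip(fp, fq)) / len(fp)


def articulatory_distance(w1, w2):
    """Edit distance over phoneme sequences.

    Substitution cost is the feature-based phoneme distance;
    insertion and deletion each cost 1.
    """
    n, m = len(w1), len(w2)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = float(i)
    for j in range(1, m + 1):
        d[0][j] = float(j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + 1.0,  # deletion
                d[i][j - 1] + 1.0,  # insertion
                d[i - 1][j - 1] + phoneme_distance(w1[i - 1], w2[j - 1]),
            )
    return d[n][m]


def cosine_distance(u, v):
    """1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy 2-dimensional embeddings, invented purely for this example.
    embeddings = {
        "pat": (1.0, 0.0),
        "bat": (0.9, 0.1),
        "mat": (0.7, 0.3),
        "tap": (0.0, 1.0),
    }
    pairs = [("pat", "bat"), ("pat", "mat"), ("pat", "tap")]
    emb_dists = [cosine_distance(embeddings[a], embeddings[b]) for a, b in pairs]
    art_dists = [articulatory_distance(a, b) for a, b in pairs]
    print("correlation:", pearson(emb_dists, art_dists))
```

A higher correlation would suggest that the embedding space preserves the reference notion of sound similarity. Words are treated here as strings of single-character phonemes for brevity; a real evaluation would operate on proper phonemic transcriptions.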
Taxonomy
Topics: Topic Modeling · Speech Recognition and Synthesis · Natural Language Processing Techniques
