The Development of a Comprehensive Spanish Dictionary for Phonetic and Lexical Tagging in Socio-phonetic Research (ESPADA)

Simon Gonzalez

arXiv:2407.15375·cs.CL·July 23, 2024

Open Access

TL;DR

This paper introduces ESPADA, a comprehensive open-source Spanish pronunciation dictionary with over 628,000 entries covering 16 countries, designed to improve phonetic and lexical analysis in socio-phonetic research.

Contribution

ESPADA is presented as the largest and most inclusive Spanish pronunciation dictionary, integrating dialectal variation with morphological, lexical, and phonetic annotations for socio-phonetic studies.

Findings

01

Over 628,000 entries covering 16 countries

02

Enhanced dialectal and phonetic analysis capabilities

03

Open-source resource for socio-phonetic research

Abstract

Pronunciation dictionaries are an important component of speech forced alignment. Their accuracy has a strong effect on the aligned speech data, since they support the mapping between orthographic transcriptions and acoustic signals. In this paper, I present the creation of a comprehensive Spanish pronunciation dictionary (ESPADA) that can be used with data from most dialectal variants of Spanish. Current dictionaries focus on specific regional variants, but the flexible nature of this tool means it can be readily applied to capture the most common phonetic differences across major dialectal variants. I propose improvements to current pronunciation dictionaries and map other relevant annotations, such as morphological and lexical information. In terms of size, it is currently the most complete dictionary, with more than 628,000 entries,…
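
To make the dictionary's role in forced alignment concrete, here is a minimal sketch of a lookup that maps orthographic words to phone sequences, with an optional dialect-specific variant. It does not reproduce ESPADA's actual file format or tag set. The entry layout, the dialect labels (`default`, `peninsular`), the phone symbols, and the function names are all assumptions made for illustration. The one dialectal contrast shown, seseo versus the /θ/ and /s/ distinction, is a well-known difference between most Latin American and north-central Peninsular varieties.

```python
from typing import Dict, List

# Hypothetical toy entries: word -> dialect label -> phone sequence.
# Labels and phone symbols are illustrative, not ESPADA's real format.
TOY_DICT: Dict[str, Dict[str, List[str]]] = {
    "casa": {"default": ["k", "a", "s", "a"]},
    "zapato": {
        # Seseo: orthographic <z> realised as [s].
        "default": ["s", "a", "p", "a", "t", "o"],
        # Varieties that distinguish /θ/ from /s/.
        "peninsular": ["θ", "a", "p", "a", "t", "o"],
    },
}


def lookup(word: str, dialect: str = "default") -> List[str]:
    """Return the phones for a word, falling back to the default variant.

    Raises KeyError for words missing from the toy dictionary.
    """
    variants = TOY_DICT[word.lower()]
    return variants.get(dialect, variants["default"])


def transcribe(text: str, dialect: str = "default") -> List[List[str]]:
    """Map each whitespace-separated word to its phone sequence."""
    return [lookup(word, dialect) for word in text.split()]
```

Calling `transcribe("casa zapato", "peninsular")` returns the default phones for `casa`, which has no dialect-specific entry here, and the /θ/-initial variant for `zapato`. A forced aligner would then match these phone sequences against the acoustic signal.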

Taxonomy

Topics: Spanish Linguistics and Language Studies · Linguistic Studies and Language Acquisition
