Linguists should learn to love speech-based deep learning models

Marianne de Heer Kloots; Paul Boersma; Willem Zuidema

arXiv:2512.14506·cs.CL·December 17, 2025

Open Access

TL;DR

This commentary urges linguists to embrace speech-based deep learning models, arguing that text-based LLMs leave out many questions about human language that written text does not capture.

Contribution

It responds to Futrell and Mahowald's framework connecting technology-oriented deep learning systems with explanation-oriented linguistic theories, and argues that audio-based models can and should play a crucial role in linguistic research.

Findings

01. Audio models better capture spoken language nuances

02. Text-based models miss key linguistic features

03. Speech models enhance understanding of human language

Abstract

Futrell and Mahowald present a useful framework bridging technology-oriented deep learning systems and explanation-oriented linguistic theories. Unfortunately, the target article's focus on generative text-based LLMs fundamentally limits fruitful interactions with linguistics, as many interesting questions on human language fall outside what is captured by written text. We argue that audio-based deep learning models can and should play a crucial role.

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications · Topic Modeling