Emergence of a phonological bias in ChatGPT
Juan Manuel Toro

TL;DR
This paper demonstrates that ChatGPT exhibits a human-like phonological bias, specifically a consonant bias, in both English and Spanish, despite the differences between how the model is trained and how human infants acquire language.
Contribution
It shows that a large language model, ChatGPT, displays a phonological bias similar to that of humans, pointing to emergent properties of AI language processing.
Findings
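ChatGPT shows a consonant bias: it tends to rely on consonants over vowels to identify words.
The bias appears in both English and Spanish, languages that differ in their relative distribution of consonants and vowels.
The bias emerges despite the differences between how the model is trained and how human infants acquire language.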
Abstract
Current large language models, such as OpenAI's ChatGPT, have captured the public's attention because of how remarkably well they use language. Here, I demonstrate that ChatGPT displays phonological biases that are a hallmark of human language processing. More concretely, just like humans, ChatGPT has a consonant bias. That is, the chatbot tends to rely on consonants over vowels to identify words. This is observed across languages that differ in their relative distribution of consonants and vowels, such as English and Spanish. Despite the differences between how current artificial intelligence language models are trained to process linguistic stimuli and how human infants acquire language, such training seems to be enough for a phonological bias to emerge in ChatGPT.
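To make the notion of a consonant bias concrete, here is a minimal Python sketch. It is an illustration only: the abstract does not describe the experimental procedure, so this should not be read as the author's method. It assumes spelling can stand in for phonology, treats "y" as a consonant, and uses invented nonword stimuli ("tupa", "tepa", "supa"). The helper names are hypothetical. Given a probe and a candidate, the sketch reports whether the candidate preserves the probe's consonants or its vowels; a system with a consonant bias would tend to treat the consonant-preserving candidate as the better match.

```python
# Illustrative only: orthography is used as a rough proxy for phonology.
# Assumption: vowels are a, e, i, o, u plus the Spanish accented forms;
# "y" is treated as a consonant for simplicity.
VOWELS = set("aeiouáéíóúü")


def consonant_skeleton(word: str) -> str:
    """Return the letters of `word` that are not vowels, in order."""
    return "".join(
        ch for ch in word.lower() if ch.isalpha() and ch not in VOWELS
    )


def vowel_skeleton(word: str) -> str:
    """Return the vowels of `word`, in order."""
    return "".join(ch for ch in word.lower() if ch in VOWELS)


def preserved_tier(probe: str, candidate: str) -> str:
    """Say which tier of `probe` the `candidate` keeps intact."""
    same_consonants = consonant_skeleton(probe) == consonant_skeleton(candidate)
    same_vowels = vowel_skeleton(probe) == vowel_skeleton(candidate)
    if same_consonants and same_vowels:
        return "both"
    if same_consonants:
        return "consonants"
    if same_vowels:
        return "vowels"
    return "neither"


if __name__ == "__main__":
    probe = "tupa"  # hypothetical nonword
    for candidate in ("tepa", "supa"):
        print(candidate, "preserves:", preserved_tier(probe, candidate))
```

In this toy setup, "tepa" keeps the consonants of "tupa" while "supa" keeps its vowels, so a forced choice between the two separates reliance on consonants from reliance on vowels.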
Taxonomy
Topics: Topic Modeling · Speech Recognition and Synthesis · Text Readability and Simplification
