Fundamental Principles of Linguistic Structure are Not Represented by o3
Elliot Murphy, Evelina Leivada, Vittoria Dentella, Fritz Gunther, Gary Marcus

TL;DR
This paper critically evaluates the o3 language model, revealing its significant shortcomings in understanding and generalizing complex linguistic structures, thus challenging claims of AI reaching human-like linguistic competence.
Contribution
The study provides a comprehensive analysis demonstrating that current deep learning models like o3 fail to grasp fundamental linguistic principles, highlighting limitations in compositional reasoning.
Findings
o3 succeeds on some basic linguistic tests that rely on linear, surface statistics (e.g., the Strawberry Test)
o3 fails to generalize basic phrase structure rules
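o3 fails on comparatives with semantically illegal cardinality comparisons ('Escher sentences') and fails to distinguish instructions to generate semantically unacceptable outputs from instructions to generate syntactically unacceptable ones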
Abstract
A core component of a successful artificial general intelligence would be the rapid creation and manipulation of grounded compositional abstractions and the demonstration of expertise in the family of recursive hierarchical syntactic objects necessary for the creative use of human language. We evaluated the recently released o3 model (OpenAI; o3-mini-high) and discovered that while it succeeds on some basic linguistic tests relying on linear, surface statistics (e.g., the Strawberry Test), it fails to generalize basic phrase structure rules; it fails with comparative sentences involving semantically illegal cardinality comparisons ('Escher sentences'); it fails to correctly rate and explain acceptability dynamics; and it fails to distinguish between instructions to generate unacceptable semantic vs. unacceptable syntactic outputs. When tasked with generating simple violations of…
Taxonomy
Topics: Linguistic research and analysis · Linguistics and Cultural Studies · Lexicography and Language Studies
