Current LLMs still cannot 'talk much' about grammar modules: Evidence from syntax

Mohammed Q. Shormani; Yehia A. AlSohbani

arXiv:2603.20114 · cs.CL · April 10, 2026

TL;DR

This study evaluates ChatGPT's ability to translate core syntactic terms into Arabic, revealing significant limitations and proposing collaboration strategies to improve LLMs' linguistic capabilities.

Contribution

It provides empirical evidence of LLMs' current shortcomings in translating syntactic concepts and suggests collaborative approaches for enhancement.

Findings

01. Only 25% of translations were accurate.

02. 38.6% of translations were inaccurate.

03. 36.4% of translations were partially correct.

Abstract

We examine the extent to which Large Language Models (LLMs) can 'talk much' about grammar modules, drawing evidence from core syntactic properties as translated into Arabic by ChatGPT. We collected 44 terms from previous work in generative syntax, including books and journal articles, as well as from our own experience in the field. These terms were translated first by humans and then by ChatGPT-5. We then analyzed and compared the two sets of translations using an analytical and comparative approach. The findings reveal that LLMs still cannot 'talk much' about the core syntactic properties embedded in the terms under study, which involve several syntactic and semantic challenges: only 25% of ChatGPT's translations were accurate, while 38.6% were inaccurate and 36.4% were partially correct, which we consider appropriate. Based on these findings, a set of actionable strategies was proposed, the…
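As a quick sanity check on the reported figures, the three percentages can be converted back into term counts out of the 44 terms. The abstract gives only percentages, so the counts below are inferred rather than reported, and the names `TOTAL_TERMS`, `reported`, and `implied_count` are illustrative only. A minimal sketch:

```python
TOTAL_TERMS = 44  # number of terms stated in the abstract

# Percentages as reported in the abstract.
reported = {"accurate": 25.0, "inaccurate": 38.6, "partially correct": 36.4}


def implied_count(percent, total=TOTAL_TERMS):
    """Nearest whole number of terms matching a reported percentage."""
    return round(percent / 100 * total)


# Inferred counts (an assumption: the abstract does not state them).
counts = {label: implied_count(p) for label, p in reported.items()}

for label, n in counts.items():
    # Recompute the percentage from the inferred count as a round-trip check.
    print(f"{label}: {n} of {TOTAL_TERMS} = {100 * n / TOTAL_TERMS:.1f}%")

print("total:", sum(counts.values()))
```

Under this reading the categories come to 11, 17, and 16 terms, which sum to 44 and round back to the reported 25%, 38.6%, and 36.4%, so the three figures are mutually consistent.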
