Clause-Internal or Clause-External? Testing Turkish Reflexive Binding in Adapted versus Chain of Thought Large Language Models
Sercan Karakaş

TL;DR
This paper compares how two large language models resolve the Turkish reflexive pronouns kendi and kendisi, using a balanced set of 100 sentences that pit local against non-local antecedents to reveal differences in the models' sensitivity to each.
Contribution
It introduces a systematic evaluation method for Turkish reflexive binding in large language models and compares the behavior of Trendyol-LLM-7B-base-v0.1, a LLaMA 2 derived model fine-tuned on Turkish data, with that of an OpenAI chain-of-thought reasoning model (o1 Mini).
Findings
Trendyol-LLM favors local antecedents in approximately 70% of trials
The OpenAI model (o1 Mini) shows a nearly even distribution between local and long-distance bindings
The results highlight how differences in model architecture and training influence the handling of anaphoric dependencies
Abstract
This study evaluates whether state-of-the-art large language models capture the binding relations of Turkish reflexive pronouns. We construct a balanced evaluation set of 100 Turkish sentences that systematically pit local against non-local antecedents for the reflexives kendi and kendisi. We compare two contrasting systems: an OpenAI chain-of-thought model optimized for multi-step reasoning and Trendyol-LLM-7B-base-v0.1, a LLaMA 2 derived model extensively fine-tuned on Turkish data. Antecedent choice is assessed using a combined paradigm that integrates sentence-level perplexity with a forced-choice comparison between minimally differing continuations. Overall, Trendyol-LLM favors local bindings in approximately 70 percent of trials, exhibiting a robust locality bias consistent with a preference for structurally proximate antecedents. By contrast, the OpenAI model (o1 Mini)…
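The evaluation paradigm described in the abstract (sentence-level perplexity plus a forced choice between minimally differing continuations) can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the abstract does not say how the two measures are combined, so the rule used here (pick the continuation with lower perplexity) is an assumption, and the function names `perplexity`, `forced_choice`, and `local_preference_rate` are hypothetical. The per-token log-probabilities are assumed to come from whatever model is being scored.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity as exp of the negative mean natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def forced_choice(local_logprobs: Sequence[float],
                  nonlocal_logprobs: Sequence[float]) -> str:
    """Pick the reading whose continuation has lower perplexity.

    Assumption: lower perplexity is taken to mean the model prefers that
    antecedent; the source does not specify the exact decision rule.
    """
    ppl_local = perplexity(local_logprobs)
    ppl_nonlocal = perplexity(nonlocal_logprobs)
    if ppl_local < ppl_nonlocal:
        return "local"
    if ppl_nonlocal < ppl_local:
        return "non-local"
    return "tie"


def local_preference_rate(choices: Sequence[str]) -> float:
    """Share of trials in which the local antecedent was chosen."""
    if not choices:
        raise ValueError("need at least one trial")
    return sum(choice == "local" for choice in choices) / len(choices)
```

Under this sketch, a model with a locality bias like the one reported for Trendyol-LLM would yield a `local_preference_rate` near 0.7 over the evaluation set, while a rate near 0.5 would correspond to the nearly even split reported for the OpenAI model.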
Taxonomy
Topics: Topic Modeling · Neurobiology of Language and Bilingualism · Natural Language Processing Techniques
