Benchmarking Source-Sensitive Reasoning in Turkish: Humans and LLMs under Evidential Trust Manipulation
Sercan Karakaş, Yusuf Şimşek

TL;DR
This study examines how source trustworthiness influences Turkish evidential morphology in human speakers and evaluates whether large language models replicate this sensitivity. Humans show a robust trust effect, whereas LLM behavior is highly model- and prompt-dependent.
Contribution
It provides the first systematic comparison of source-sensitive evidential reasoning between Turkish speakers and LLMs, highlighting the human-LLM gap and model-dependent behaviors.
Findings
Native speakers of Turkish show a robust trust effect: High-Trust contexts yield relatively more -DI, and Low-Trust contexts yield relatively more -mIs.
LLMs exhibit unstable and often reversed trust-sensitive behaviors across models and prompts.
There is a significant gap between human and LLM source-sensitive evidential reasoning.
Abstract
This paper investigates whether source trustworthiness shapes Turkish evidential morphology and whether large language models (LLMs) track this sensitivity. We study the past-domain contrast between -DI and -mIs in controlled cloze contexts where the information source is overtly external, while only its perceived reliability is manipulated (High-Trust vs. Low-Trust). In a human production experiment, native speakers of Turkish show a robust trust effect: High-Trust contexts yield relatively more -DI, whereas Low-Trust contexts yield relatively more -mIs, with the pattern remaining stable across sensitivity analyses. We then evaluate 10 LLMs in three prompting paradigms (open gap-fill, explicit past-tense gap-fill, and forced-choice A/B selection). LLM behavior is highly model- and prompt-dependent: some models show weak or local trust-consistent shifts, but effects are generally…
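The trust effect described in the abstract can be read as a difference in the share of -DI responses between the two conditions. The following is a minimal sketch of that comparison, not the authors' analysis: the function names, the `"high"`/`"low"` and `"DI"`/`"mIs"` labels, and the toy responses are all assumptions made for illustration, and the numbers are invented rather than taken from the paper.

```python
from collections import Counter


def di_share(responses, condition):
    """Share of -DI among the -DI/-mIs responses in one trust condition.

    `responses` is a list of (condition, form) pairs. Forms other than
    "DI" and "mIs" are ignored (hypothetical labelling scheme).
    """
    counts = Counter(form for cond, form in responses if cond == condition)
    total = counts["DI"] + counts["mIs"]
    if total == 0:
        raise ValueError(f"no -DI/-mIs responses for condition {condition!r}")
    return counts["DI"] / total


def trust_effect(responses):
    """Positive values mean relatively more -DI under High-Trust."""
    return di_share(responses, "high") - di_share(responses, "low")


# Invented toy data, for illustration only (not results from the paper).
toy_responses = [
    ("high", "DI"), ("high", "DI"), ("high", "DI"), ("high", "mIs"),
    ("low", "DI"), ("low", "mIs"), ("low", "mIs"), ("low", "mIs"),
]

effect = trust_effect(toy_responses)
print(f"trust effect (High-Trust minus Low-Trust share of -DI): {effect:.2f}")
```

Under this reading, a human-like pattern corresponds to a positive value, a reversed pattern to a negative one, and the same function could be applied to the output of each model in each prompting paradigm.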
