What Makes a Good Query? Measuring the Impact of Human-Confusing Linguistic Features on LLM Performance

William Watson; Nicole Cho; Sumitra Ganesh; Manuela Veloso

arXiv:2602.20300 · cs.CL · February 25, 2026

TL;DR

This paper investigates how linguistic features of user queries influence hallucination rates in large language models, identifying specific features that increase or decrease hallucination risk to inform better query design.

Contribution

The paper introduces a 22-dimensional linguistic feature vector for queries and, in a large-scale analysis of 369,837 real-world queries, shows how these features correlate with hallucination likelihood.
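To make the idea of a query feature vector concrete, here is a minimal sketch in Python. It is not the paper's feature set: the word lists, the function name `toy_query_features`, and the four features are hypothetical stand-ins, chosen as crude surface proxies for negation, anaphora, and clause complexity.

```python
import re

# Hypothetical word lists (assumptions for illustration only; the paper's
# actual 22 feature definitions are not reproduced on this page).
NEGATIONS = {"not", "no", "never", "none", "cannot", "without"}
PRONOUNS = {"it", "they", "them", "this", "that", "these", "those", "he", "she"}
SUBORDINATORS = {"that", "which", "who", "because", "although", "if", "while"}


def tokenize(query: str) -> list[str]:
    """Lowercase the query and split it into word tokens (apostrophes kept)."""
    return re.findall(r"[a-z']+", query.lower())


def toy_query_features(query: str) -> dict[str, float]:
    """Return a small, hypothetical feature vector for one query."""
    tokens = tokenize(query)
    return {
        "n_tokens": float(len(tokens)),
        # Explicit negators plus contractions such as "didn't".
        "negation_count": float(
            sum(t in NEGATIONS or t.endswith("n't") for t in tokens)
        ),
        # Crude anaphora proxy: pronouns that need an antecedent.
        "pronoun_count": float(sum(t in PRONOUNS for t in tokens)),
        # Crude clause-complexity proxy: subordinating words.
        "subordinator_count": float(sum(t in SUBORDINATORS for t in tokens)),
    }


features = toy_query_features("Why didn't it work if that was not the cause?")
```

A word such as "that" counts toward both the pronoun and the subordinator proxy here; a real implementation would presumably rely on a syntactic parser rather than word lists.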

Findings

01 Deep clause nesting is associated with higher hallucination risk

02 Clear intention grounding is associated with lower hallucination rates

03 Domain-specific features show mixed effects

Abstract

Large Language Model (LLM) hallucinations are usually treated as defects of the model or its decoding strategy. Drawing on classical linguistics, we argue that a query's form can also shape a listener's (and a model's) response. We operationalize this insight by constructing a 22-dimension query feature vector covering clause complexity, lexical rarity, anaphora, negation, answerability, and intention grounding, all known to affect human comprehension. Using 369,837 real-world queries, we ask: Are there certain types of queries that make hallucination more likely? A large-scale analysis reveals a consistent "risk landscape": certain features such as deep clause nesting and underspecification align with higher hallucination propensity. In contrast, clear intention grounding and answerability align with lower hallucination rates. Others, including domain specificity, show mixed,…
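One simple way to read "aligns with higher hallucination propensity" is as a difference in hallucination rates between queries that have a feature and queries that do not. The sketch below illustrates that reading only; the statistical method the paper actually uses is not described on this page, and the function name `hallucination_rate_gap` and the four toy records are made-up placeholders, not results.

```python
from typing import Iterable, Optional


def hallucination_rate_gap(
    records: Iterable[tuple[bool, bool]],
) -> Optional[float]:
    """Return rate(with feature) minus rate(without feature).

    Each record is (feature_present, hallucinated). Returns None when
    either group is empty, since the gap is undefined in that case.
    """
    rows = list(records)  # materialize so the data can be scanned twice
    with_feature = [h for f, h in rows if f]
    without_feature = [h for f, h in rows if not f]
    if not with_feature or not without_feature:
        return None
    rate_with = sum(with_feature) / len(with_feature)
    rate_without = sum(without_feature) / len(without_feature)
    return rate_with - rate_without


# Placeholder records purely to exercise the function (not data from the paper).
toy_records = [(True, True), (True, False), (False, False), (False, False)]
gap = hallucination_rate_gap(toy_records)
```

A positive gap means the feature co-occurs with more hallucinations in the sample; by itself it says nothing about causation or statistical significance.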

Taxonomy

Topics: Neurobiology of Language and Bilingualism · Topic Modeling · Text Readability and Simplification