Un-considering Contextual Information: Assessing LLMs' Understanding of Indexical Elements
Metehan Oguz, Yavuz Bakman, Duygu Nur Yaldiz

TL;DR
This paper investigates how large language models interpret indexical words such as I, you, here, and tomorrow. It releases a new dataset and evaluates multiple models, finding that understanding varies across indexicals and is influenced by syntactic cues.
Contribution
It introduces the first English Indexical Dataset, with 1600 multiple-choice questions, and evaluates LLMs' understanding of indexicals, highlighting their strengths and limitations.
Findings
LLMs perform well on the indexical 'I'
LLMs struggle with 'you', 'here', and 'tomorrow'
Syntactic cues can both aid and hinder LLM performance
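The per-indexical comparison behind these findings can be illustrated with a minimal sketch. It assumes each multiple-choice question carries the indexical it tests and a gold answer label; the field names `indexical` and `answer`, the function `accuracy_by_indexical`, and the toy items are all hypothetical and are not taken from the released dataset or the authors' evaluation code.

```python
from collections import defaultdict


def accuracy_by_indexical(items, predictions):
    """Return accuracy per indexical.

    items: list of dicts with hypothetical keys 'indexical' and 'answer'.
    predictions: list of option labels chosen by a model, aligned with items.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for item, pred in zip(items, predictions):
        total[item["indexical"]] += 1
        if pred == item["answer"]:
            correct[item["indexical"]] += 1
    return {key: correct[key] / total[key] for key in total}


# Toy, made-up items for illustration only (not real dataset entries).
items = [
    {"indexical": "I", "answer": "A"},
    {"indexical": "I", "answer": "B"},
    {"indexical": "you", "answer": "A"},
    {"indexical": "you", "answer": "C"},
]
predictions = ["A", "B", "A", "B"]

scores = accuracy_by_indexical(items, predictions)
print(scores)
```

Grouping by indexical rather than reporting a single overall score is what makes a gap between 'I' and the other indexicals visible.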
Abstract
Large Language Models (LLMs) have demonstrated impressive performance in tasks related to coreference resolution. However, previous studies mostly assessed LLM performance on coreference resolution with nouns and third person pronouns. This study evaluates LLM performance on coreference resolution with indexicals like I, you, here, and tomorrow, which come with unique challenges due to their linguistic properties. We present the first study examining how LLMs interpret indexicals in English, releasing the English Indexical Dataset with 1600 multiple-choice questions. We evaluate pioneering LLMs, including GPT-4o, Claude 3.5 Sonnet, Gemini 1.5 Pro, and DeepSeek V3. Our results reveal that LLMs exhibit impressive performance with some indexicals (I), while struggling with others (you, here, tomorrow), and that syntactic cues (e.g. quotation) contribute to LLM performance with some…
Taxonomy
Topics: Mathematics, Computing, and Information Processing · Library Science and Information Systems
