Theory Trace Card: Theory-Driven Socio-Cognitive Evaluation of LLMs
Farzan Karimi-Malekabadi, Suhaib Abdurahman, Zhivar Sourati, Jackson Trager, and Morteza Dehghani

TL;DR
This paper introduces the Theory Trace Card (TTC), a framework that explicitly links socio-cognitive evaluations of large language models to their underlying theories, improving the interpretability and validity of benchmark results.
Contribution
The paper formalizes the theory gap in socio-cognitive evaluations and proposes the TTC as a lightweight artifact for explicitly documenting theoretical assumptions and limitations.
Findings
The TTC improves the clarity and interpretability of evaluations.
Formalizing the theory gap highlights systemic validity issues.
The TTC makes it easier to understand what benchmarks measure.
Abstract
Socio-cognitive benchmarks for large language models (LLMs) often fail to predict real-world behavior, even when models achieve high benchmark scores. Prior work has attributed this evaluation-deployment gap to problems of measurement and validity. While these critiques are insightful, we argue that they overlook a more fundamental issue: many socio-cognitive evaluations proceed without an explicit theoretical specification of the target capability, leaving the assumptions linking task performance to competence implicit. Without this theoretical grounding, benchmarks that exercise only narrow subsets of a capability are routinely misinterpreted as evidence of broad competence: a gap that creates a systemic validity illusion by masking the failure to evaluate the capability's other essential dimensions. To address this gap, we make two contributions. First, we diagnose and formalize this…
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Healthcare and Education
