Where Norms and References Collide: Evaluating LLMs on Normative Reasoning

Mitchell Abrams; Kaveh Eskandari Miandoab; Felix Gervits; Vasanth Sarathy; Matthias Scheutz

arXiv:2602.02975 · cs.CL · February 4, 2026

TL;DR

This paper evaluates whether large language models can understand and apply social norms in contextually grounded tasks, revealing significant limitations in their ability to handle implicit, underspecified, or conflicting norms.

Contribution

Introduces SNIC, a diagnostic testbed for assessing LLMs' ability to reason over social norms in embodied, real-world scenarios, highlighting current model shortcomings.

Findings

01. LLMs struggle with implicit norms
02. Models have difficulty with underspecified norms
03. Conflicting norms pose challenges for LLMs

Abstract

Embodied agents, such as robots, will need to interact in situated environments where successful communication often depends on reasoning over social norms: shared expectations that constrain what actions are appropriate in context. A key capability in such settings is norm-based reference resolution (NBRR), where interpreting referential expressions requires inferring implicit normative expectations grounded in physical and social context. Yet it remains unclear whether Large Language Models (LLMs) can support this kind of reasoning. In this work, we introduce SNIC (Situated Norms in Context), a human-validated diagnostic testbed designed to probe how well state-of-the-art LLMs can extract and utilize normative principles relevant to NBRR. SNIC emphasizes physically grounded norms that arise in everyday tasks such as cleaning, tidying, and serving. Across a range of controlled…

Taxonomy

Topics: Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI) · Social Robot Interaction and HRI