SceneGram: Conceptualizing and Describing Tangrams in Scene Context

Simeon Junker; Sina Zarrieß

arXiv:2506.11631·cs.CL·June 16, 2025

TL;DR

This paper introduces SceneGram, a dataset capturing human references to tangram shapes in various scene contexts, enabling analysis of how scene context influences conceptualization and revealing limitations of multimodal LLMs in modeling this variability.

Contribution

The paper presents SceneGram, a novel dataset for studying scene-dependent conceptualization of objects, and analyzes the gap between human references and multimodal LLM outputs.

Findings

01. Humans produce diverse references influenced by scene context.

02. Multimodal LLMs lack the richness and variability of human conceptualizations.

03. SceneGram enables systematic analysis of scene-dependent object references.

Abstract

Research on reference and naming suggests that humans can come up with very different ways of conceptualizing and referring to the same object, e.g., the same abstract tangram shape can be a "crab", "sink" or "space ship". Another common assumption in cognitive science is that scene context fundamentally shapes our visual perception of objects and conceptual expectations. This paper contributes SceneGram, a dataset of human references to tangram shapes placed in different scene contexts, allowing for systematic analyses of the effect of scene context on conceptualization. Based on this data, we analyze references to tangram shapes generated by multimodal LLMs, showing that these models do not account for the richness and variability of conceptualizations found in human references.

Taxonomy

Topics: Cinema and Media Studies