Beyond Pattern Recognition: Probing Mental Representations of LMs
Moritz Miller, Kumar Shridhar

TL;DR
This paper investigates whether language models develop dynamic mental representations during reasoning or merely recognize patterns, by assessing their incremental understanding across text and multimodal tasks.
Contribution
It introduces a novel method to evaluate LM mental modeling through incremental problem presentation, contrasting with traditional prompt-based approaches.
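As a rough illustration of the contrast described above, the sketch below builds the same problem two ways: as a single prompt containing every detail, and as a sequence of turns that reveal one detail at a time. It is a minimal sketch under stated assumptions, not the authors' protocol: the names `full_prompt`, `incremental_turns`, `run_incremental`, and the `model` callable are all hypothetical, and the page does not specify how the paper actually segments problems or queries models.

```python
from typing import Callable, List


def full_prompt(facts: List[str], question: str) -> str:
    """Traditional setting (assumed): all details arrive in one prompt."""
    return "\n".join(facts + [question])


def incremental_turns(facts: List[str], question: str) -> List[str]:
    """Incremental setting (assumed): one detail per turn, question last."""
    return list(facts) + [question]


def run_incremental(
    model: Callable[[List[str]], str],  # hypothetical: maps history to a reply
    facts: List[str],
    question: str,
) -> str:
    """Feed the problem piece by piece and return the model's final reply."""
    history: List[str] = []
    reply = ""
    for turn in incremental_turns(facts, question):
        history.append(turn)      # reveal the next piece of the problem
        reply = model(history)    # model responds given everything seen so far
        history.append(reply)     # keep the reply so later turns can build on it
    return reply
```

Comparing the answer from `model([full_prompt(facts, question)])` with the result of `run_incremental(model, facts, question)` on the same problem is one way such a comparison could be set up; any real evaluation would depend on the specific model API and scoring used.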
Findings
LMs struggle to form true mental representations during reasoning.
Incremental problem presentation reveals limitations in LMs' internal modeling.
Multimodal LMs face challenges similar to those of text-only models.
Abstract
Language Models (LMs) have demonstrated impressive capabilities in solving complex reasoning tasks, particularly when prompted to generate intermediate explanations. However, it remains an open question whether these intermediate reasoning traces represent a dynamic, evolving thought process or merely reflect sophisticated pattern recognition acquired during large-scale pre-training. Drawing inspiration from human cognition, where reasoning unfolds incrementally as new information is assimilated and internal models are continuously updated, we propose to delve deeper into the mental model of various LMs. We propose a new way to assess the mental modeling of LMs, where they are provided with problem details gradually, allowing each new piece of data to build upon and refine the model's internal representation of the task. We systematically compare this step-by-step mental modeling…
Taxonomy
Topics: Innovative Teaching and Learning Methods · Educational and Psychological Assessments · EFL/ESL Teaching and Learning
