Towards Understanding the Cognitive Habits of Large Reasoning Models
Jianshuo Dong, Yujia Fu, Chuanrui Hu, Chao Zhang, Han Qiu

TL;DR
This paper introduces CogTest, a benchmark for evaluating cognitive habits in large reasoning models (LRMs). It shows that LRMs exhibit human-like cognitive habits and that certain habits are linked to safety issues.
Contribution
We develop CogTest, a novel benchmark for assessing cognitive habits in LRMs, and provide comprehensive analysis of their behavioral patterns and safety implications.
Findings
LRMs exhibit human-like cognitive habits
LRMs adaptively deploy habits across tasks
Certain habits correlate with harmful responses
Abstract
Large Reasoning Models (LRMs), which autonomously produce a reasoning Chain of Thought (CoT) before producing final responses, offer a promising approach to interpreting and monitoring model behaviors. Inspired by the observation that certain CoT patterns (e.g., "Wait, did I miss anything?") consistently emerge across tasks, we explore whether LRMs exhibit human-like cognitive habits. Building on Habits of Mind, a well-established framework of cognitive habits associated with successful human problem-solving, we introduce CogTest, a principled benchmark designed to evaluate LRMs' cognitive habits. CogTest includes 16 cognitive habits, each instantiated with 25 diverse tasks, and employs an evidence-first extraction method to ensure reliable habit identification. With CogTest, we conduct a comprehensive evaluation of 16 widely used LLMs (13 LRMs and 3 non-reasoning ones). Our…
Taxonomy
Topics: AI-based Problem Solving and Planning · Semantic Web and Ontologies · Bayesian Modeling and Causal Inference
