THiNK: Can Large Language Models Think-aloud?

Yongan Yu; Mengqian Wu; Yiran Lin; Nikki G. Lobczowski

arXiv:2505.20184 · cs.CL · May 27, 2025

TL;DR

THiNK is a multi-agent, feedback-driven framework based on Bloom's Taxonomy that systematically evaluates and enhances the reasoning skills of large language models through iterative reflection and refinement.

Contribution

We introduce THiNK, a novel evaluation framework that assesses and improves both lower- and higher-order thinking skills in LLMs using iterative problem generation, critique, and revision.

Findings

01 Models excel at lower-order thinking skills.

02 Models struggle with applying knowledge in realistic contexts.

03 Structured feedback improves higher-order reasoning.

Abstract

Assessing higher-order thinking skills in large language models (LLMs) remains a fundamental challenge, especially in tasks that go beyond surface-level accuracy. In this work, we propose THiNK (Testing Higher-order Notion of Knowledge), a multi-agent, feedback-driven evaluation framework grounded in Bloom's Taxonomy. THiNK frames reasoning assessment as an iterative task of problem generation, critique, and revision, encouraging LLMs to think-aloud through step-by-step reflection and refinement. This enables a systematic evaluation of both lower-order (e.g., remember, understand) and higher-order (e.g., evaluate, create) thinking skills. We apply THiNK to seven state-of-the-art LLMs and perform a detailed cognitive analysis of their outputs. Results reveal that while models reliably perform lower-order categories well, they struggle with applying knowledge in realistic contexts and…
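The generate, critique, and revise cycle described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the function names, the per-level score dictionary, the 0.8 threshold, and the round limit are all assumptions made for illustration, and the toy agents stand in for LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# The six levels of the revised Bloom's Taxonomy, lower-order to higher-order.
BLOOM_LEVELS = ["remember", "understand", "apply", "analyze", "evaluate", "create"]


@dataclass
class Round:
    """One iteration of the loop: the problem, its scores, and the critique."""
    problem: str
    scores: Dict[str, float]
    feedback: str


def refine(
    generate: Callable[[], str],
    critique: Callable[[str], Tuple[Dict[str, float], str]],
    revise: Callable[[str, str], str],
    threshold: float = 0.8,   # hypothetical pass mark, not from the paper
    max_rounds: int = 3,      # hypothetical iteration cap, not from the paper
) -> List[Round]:
    """Run generate -> critique -> revise until every level passes or rounds run out."""
    problem = generate()
    history: List[Round] = []
    for _ in range(max_rounds):
        scores, feedback = critique(problem)
        history.append(Round(problem, scores, feedback))
        if min(scores.values()) >= threshold:
            break
        problem = revise(problem, feedback)
    return history


# Toy stand-ins for the agents (hypothetical; a real setup would call LLMs here).
def toy_generate() -> str:
    return "Compute 3 + 4."


def toy_critique(problem: str) -> Tuple[Dict[str, float], str]:
    grounded = "scenario" in problem
    scores = {level: 1.0 for level in BLOOM_LEVELS}
    if not grounded:
        scores["apply"] = 0.4
    feedback = "" if grounded else "Ground the problem in a realistic scenario."
    return scores, feedback


def toy_revise(problem: str, feedback: str) -> str:
    return "In a shop scenario: " + problem


history = refine(toy_generate, toy_critique, toy_revise)
for i, r in enumerate(history, start=1):
    print(i, r.problem, min(r.scores.values()))
```

Keeping the full history rather than only the final problem makes it possible to compare scores per Bloom level across rounds, which is the kind of before-and-after comparison a feedback-driven evaluation relies on.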

Code & Models

Repositories

michaelyya/think (Official)

Taxonomy

Topics: Natural Language Processing Techniques

Methods: ALIGN