Adapting Like Humans: A Metacognitive Agent with Test-time Reasoning
Yang Li, Zhiyuan He, Yuxuan Huang, Zhuhanling Xiao, Chao Yu, Meng Fang, Kun Shao, Jun Wang

TL;DR
This paper introduces a metacognitive framework for vision-language models that enables test-time learning and adaptation through hierarchical reasoning and memory, significantly improving performance on unseen tasks.
Contribution
The paper presents MCTR, a test-time reasoning framework inspired by human metacognition. It pairs a meta-level and an object-level reasoning module, each with its own memory system, for adaptive reasoning in vision-language models.
Findings
Achieves 9 out of 12 top-1 results on unseen Atari games.
Demonstrates effective test-time adaptation and strategy refinement.
Shows meta-reasoning evolves toward human-like adaptation strategies.
Abstract
Recent Vision-Language Models (VLMs) exhibit strong perceptual reasoning abilities, yet they often struggle to adapt efficiently when encountering novel tasks at test time. In contrast, humans leverage a metacognitive model with memory, enabling continuous strategy refinement through metacognitive control when faced with new challenges. To bridge this gap, we propose metacognitive test-time reasoning (MCTR), a framework that equips models with the ability to learn, adapt, and improve during test time through metacognitive self-updating. Inspired by the dual structure of human metacognition, MCTR comprises meta-level and object-level VLM reasoning modules, each equipped with dedicated memory systems for hierarchical adaptive reasoning. Specifically, MCTR consists of (1) a meta-reasoning module which incrementally builds a structured memory by discovering and storing task-relevant…
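The two-level structure described in the abstract can be illustrated with a toy loop. This is only a sketch under strong assumptions, not the authors' implementation: the names DualModuleAgent, toy_object_reasoner, toy_meta_reasoner, the "prefer <action>" rule format, the meta_every schedule, and the three-action toy environment are all hypothetical, and simple Python functions stand in for the VLM reasoning modules.

```python
from typing import Callable, List, Tuple

Transition = Tuple[str, str, float]  # (observation, action, reward)
ACTIONS = ["LEFT", "RIGHT", "FIRE"]  # hypothetical action set


def toy_object_reasoner(observation: str, rules: List[str],
                        history: List[Transition]) -> str:
    """Stand-in for an object-level VLM: follow a stored rule, else explore."""
    for rule in rules:
        if rule.startswith("prefer "):
            return rule[len("prefer "):]
    # No rule yet: cycle through the actions as naive exploration.
    return ACTIONS[len(history) % len(ACTIONS)]


def toy_meta_reasoner(history: List[Transition], rules: List[str]) -> List[str]:
    """Stand-in for a meta-level VLM: propose rules from rewarded actions."""
    return [f"prefer {action}" for (_, action, reward) in history if reward > 0]


class DualModuleAgent:
    """Object-level module acts; meta-level module periodically updates rules."""

    def __init__(self, object_reasoner: Callable, meta_reasoner: Callable,
                 meta_every: int = 4, history_size: int = 8) -> None:
        self.object_reasoner = object_reasoner
        self.meta_reasoner = meta_reasoner
        self.meta_every = meta_every      # assumed schedule for meta updates
        self.history_size = history_size  # assumed bound on recent history
        self.rules: List[str] = []            # meta-level memory
        self.history: List[Transition] = []   # object-level memory
        self.steps = 0

    def act(self, observation: str) -> str:
        return self.object_reasoner(observation, self.rules, self.history)

    def observe(self, observation: str, action: str, reward: float) -> None:
        self.history.append((observation, action, reward))
        self.history = self.history[-self.history_size:]
        self.steps += 1
        if self.steps % self.meta_every == 0:
            # Test-time update: store newly discovered rules, skipping repeats.
            for rule in self.meta_reasoner(self.history, self.rules):
                if rule not in self.rules:
                    self.rules.append(rule)


def run_demo(steps: int = 8):
    """Toy environment: only the action RIGHT is rewarded."""
    agent = DualModuleAgent(toy_object_reasoner, toy_meta_reasoner)
    actions = []
    for t in range(steps):
        observation = f"frame-{t}"
        action = agent.act(observation)
        reward = 1.0 if action == "RIGHT" else 0.0
        agent.observe(observation, action, reward)
        actions.append(action)
    return agent, actions
```

In this toy setting the agent explores for the first four steps, the meta-level stand-in then records a rule from the one rewarded transition, and later actions follow that rule. The point of the sketch is only the division of labour between the two memories; how MCTR actually discovers, stores, and applies knowledge is not recoverable from the truncated abstract.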
Taxonomy
Topics: Reinforcement Learning in Robotics · Multimodal Machine Learning Applications · Artificial Intelligence in Games
