An Algorithmic Theory of Metacognition in Minds and Machines
Rylan Schaeffer

TL;DR
This paper introduces an algorithmic theory of metacognition in humans and machines. The theory explains how error detection is possible and yields a novel Actor-Critic agent that can identify some of its own suboptimal actions without external cues.
Contribution
The paper proposes a theory linking metacognition to the trade-off between value-based and policy-based RL, and develops the Metacognitive Actor Critic (MAC), a new agent architecture that can detect some of its own suboptimal actions.
Findings
MAC can detect some of its own suboptimal actions without external information.
The theory links the trade-off between value-based and policy-based RL to metacognitive processes.
A deep MAC implementation demonstrates metacognitive capabilities in practice.
Abstract
Humans sometimes choose actions that they themselves can identify as sub-optimal, or wrong, even in the absence of additional information. How is this possible? We present an algorithmic theory of metacognition based on a well-understood trade-off in reinforcement learning (RL) between value-based RL and policy-based RL. To the cognitive (neuro)science community, our theory answers the outstanding question of why information can be used for error detection but not for action selection. To the machine learning community, our proposed theory creates a novel interaction between the Actor and Critic in Actor-Critic agents and notes a novel connection between RL and Bayesian Optimization. We call our proposed agent the Metacognitive Actor Critic (MAC). We conclude with showing how to create metacognition in machines by implementing a deep MAC and showing that it can detect (some of) its own…
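The abstract does not spell out how the Actor and Critic interact in MAC, so the following is only an illustrative sketch and not the paper's algorithm. It assumes a two-action setting in which the actor samples from a softmax policy while the critic keeps its own value estimates, and it uses a hypothetical rule, `flags_own_action`, that flags the chosen action whenever the critic values another action more highly. All names, numbers, and the `margin` parameter are assumptions made for illustration.

```python
import math
import random


def softmax(preferences):
    """Turn the actor's action preferences into a probability distribution."""
    peak = max(preferences)  # subtract the max for numerical stability
    exps = [math.exp(p - peak) for p in preferences]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(preferences, rng):
    """Actor: sample an action index from the softmax policy."""
    probs = softmax(preferences)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def flags_own_action(q_values, action, margin=0.0):
    """Hypothetical critic-side check (an assumption, not the paper's rule):
    flag the chosen action when the critic's best estimate exceeds the
    chosen action's estimate by more than `margin`."""
    return max(q_values) - q_values[action] > margin


rng = random.Random(0)
preferences = [0.0, 0.0]  # assumed: the actor is still undecided
q_values = [1.0, 0.2]     # assumed: the critic already prefers action 0

action = sample_action(preferences, rng)
flagged = flags_own_action(q_values, action)
print(f"action={action}, flagged as suboptimal by the agent itself: {flagged}")
```

In this toy setup the agent needs no new observation to flag its choice: the critic's estimates already disagree with what the actor sampled, which loosely mirrors the abstract's point that information can serve error detection without having driven action selection.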
Taxonomy
Topics: Neural and Behavioral Psychology Studies · Reinforcement Learning in Robotics · Embodied and Extended Cognition
