Exploration Through Introspection: A Self-Aware Reward Model

Michael Petrowski; Milica Gašić

arXiv:2601.03389·cs.AI·January 8, 2026

Open Access

TL;DR

This paper introduces a self-aware reinforcement learning framework where agents infer their internal states using introspection, leading to improved learning performance and human-like behaviors in gridworld environments.

Contribution

The paper presents a novel introspective exploration method that uses a hidden Markov model to infer a "pain-belief" signal, modeling self-awareness and pain perception in reinforcement learning agents.

Findings

01 Introspective agents outperform baseline agents in learning tasks.

02 Agents can replicate complex human-like behaviors.

03 Self-awareness influences agent performance under different pain models.

Abstract

Understanding how artificial agents model internal mental states is central to advancing Theory of Mind in AI. Evidence points to a unified system for self- and other-awareness. We explore this self-awareness by having reinforcement learning agents infer their own internal states in gridworld environments. Specifically, we introduce an introspective exploration component that is inspired by biological pain as a learning signal by utilizing a hidden Markov model to infer "pain-belief" from online observations. This signal is integrated into a subjective reward function to study how self-awareness affects the agent's learning abilities. Further, we use this computational framework to investigate the difference in performance between normal and chronic pain perception models. Results show that introspective agents in general significantly outperform standard baseline agents and can…
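
The abstract describes inferring a "pain-belief" with a hidden Markov model and folding it into a subjective reward. A minimal sketch of that idea follows, under loudly stated assumptions: the two hidden states, the transition and emission probabilities, the binary observation encoding, and the linear penalty in `subjective_reward` are all hypothetical choices for illustration and are not taken from the paper.

```python
# Illustrative sketch only: the states, probabilities, and reward form below
# are assumptions, not values or equations from the paper.

# Hidden states: 0 = "no pain", 1 = "pain".
TRANSITION = [[0.9, 0.1],   # P(next state | current = no pain)
              [0.3, 0.7]]   # P(next state | current = pain)
# Observations: 0 = neutral step, 1 = harmful event (e.g. hitting a hazard).
EMISSION = [[0.95, 0.05],   # P(observation | no pain)
            [0.20, 0.80]]   # P(observation | pain)


def update_pain_belief(belief, observation):
    """One step of the HMM forward (filtering) recursion.

    belief: [P(no pain), P(pain)] before the new observation.
    Returns the normalized posterior after seeing `observation`.
    """
    # Predict: propagate the belief through the transition model.
    predicted = [
        sum(belief[i] * TRANSITION[i][j] for i in range(2)) for j in range(2)
    ]
    # Correct: weight by the likelihood of the observation, then normalize.
    unnormalized = [predicted[j] * EMISSION[j][observation] for j in range(2)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def subjective_reward(env_reward, belief, pain_weight=1.0):
    """Environment reward minus a penalty scaled by the pain belief (assumed form)."""
    return env_reward - pain_weight * belief[1]


if __name__ == "__main__":
    belief = [1.0, 0.0]  # start certain of "no pain"
    for obs in [0, 1, 1, 0]:
        belief = update_pain_belief(belief, obs)
        print(obs, round(belief[1], 3), round(subjective_reward(0.0, belief), 3))
```

In this sketch the belief is updated online, one observation at a time, so a reinforcement learning agent could call `update_pain_belief` after every environment step and learn from `subjective_reward` in place of the raw environment reward. How the paper actually defines these quantities may differ.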

Taxonomy

Topics: Embodied and Extended Cognition · Emotion and Mood Recognition · Social Robot Interaction and HRI