Mirror-Neuron Patterns in AI Alignment
Robyn Wyrick

TL;DR
This paper explores whether artificial neural networks can develop mirror-neuron-like patterns that support empathy and cooperation, potentially enhancing AI alignment with human values through intrinsic social cognition mechanisms.
Contribution
It demonstrates that scaled ANNs with self/other coupling can develop mirror-neuron patterns, proposing a new framework for intrinsic AI alignment via empathy-like circuits.
Findings
Mirror-neuron patterns emerge in scaled ANNs with self/other coupling.
These patterns support cooperative behaviors in AI agents.
The study introduces the Checkpoint Mirror Neuron Index (CMNI) for quantification.
Abstract
As artificial intelligence (AI) advances toward superhuman capabilities, aligning these systems with human values becomes increasingly critical. Current alignment strategies rely largely on externally specified constraints, which may prove insufficient against a future superintelligent AI capable of circumventing top-down controls. This research investigates whether artificial neural networks (ANNs) can develop patterns analogous to biological mirror neurons, cells that activate both when performing an action and when observing it, and how such patterns might contribute to intrinsic alignment in AI. Mirror neurons play a crucial role in empathy, imitation, and social cognition in humans. The study therefore asks: (1) Can simple ANNs develop mirror-neuron patterns? and (2) How might these patterns contribute to ethical and cooperative decision-making in AI systems? Using a novel Frog and Toad game…
Taxonomy
Topics: Action Observation and Synchronization · Embodied and Extended Cognition · Face Recognition and Perception
