Inner Speech as Behavior Guides: Steerable Imitation of Diverse Behaviors for Human-AI coordination

Rakshit Trivedi; Kartik Sharma; David C Parkes

arXiv:2602.20517 · cs.AI · February 25, 2026

TL;DR

This paper introduces MIMIC, a novel framework that uses language-based inner speech to improve imitation learning in AI, enabling diverse, faithful, and steerable behaviors in human-AI coordination tasks.

Contribution

MIMIC is the first approach to incorporate language-based inner speech for steering and diversifying imitation learning in AI agents.

Findings

1. Enhanced behavior diversity and fidelity in tasks
2. Effective behavioral steering at inference time
3. No additional demonstration data needed for steering

Abstract

Effective human-AI coordination requires artificial agents capable of exhibiting and responding to human-like behaviors while adapting to changing contexts. Imitation learning has emerged as one of the prominent approaches to build such agents by training them to mimic human-demonstrated behaviors. However, current methods struggle to capture the inherent diversity and non-Markovian nature of human behavior and lack the ability to steer behavior at inference time. Drawing inspiration from the theory of human cognitive processes, where inner speech guides action selection before execution, we propose MIMIC (Modeling Inner Motivations for Imitation and Control), a framework that uses language as an internal representation of behavioral intent. MIMIC employs the novel use of vision-language models as linguistic scaffolding to train a conditional variational autoencoder capable of…

Videos

Inner Speech as Behavior Guides: Steerable Imitation of Diverse Behaviors for Human-AI coordination (SlidesLive)

Taxonomy

Topics: Social Robot Interaction and HRI · Action Observation and Synchronization · Multimodal Machine Learning Applications