Automated Meta Prompt Engineering for Alignment with the Theory of Mind

Aaron Baughman; Rahul Agarwal; Eduardo Morales; Gozde Akay

arXiv:2505.09024 · cs.AI · May 15, 2025

TL;DR

This paper presents a novel meta-prompting approach using agentic reinforcement learning to align AI-generated content with human mental expectations, demonstrated through live sports event content creation.

Contribution

It introduces a method for optimizing LLMs to anticipate and incorporate human edits, improving content alignment with human mental models in real-time settings.

Findings

01 Achieved 53.8% alignment with human content reviewers.

02 Increased content quality by extending tennis action coverage.

03 Deployed successfully at US Open 2024 and other live events.

Abstract

We introduce a method of meta-prompting that jointly produces fluent text for complex tasks while optimizing the similarity of neural states between a human's mental expectation and a Large Language Model's (LLM) neural processing. A technique of agentic reinforcement learning is applied, in which an LLM as a Judge (LLMaaJ) teaches another LLM, through in-context learning, how to produce content by interpreting the intended and unintended generated text traits. To measure human mental beliefs around content production, users modify long form AI-generated text articles before publication at the US Open 2024 tennis Grand Slam. Now, an LLMaaJ can solve the Theory of Mind (ToM) alignment problem by anticipating and including human edits within the creation of text from an LLM. Throughout experimentation and by interpreting the results of a live production system, the expectations of human…
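
The judge-teaches-generator loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `toy_generator` and `toy_judge` are hypothetical stand-ins for the two LLMs, the "expected traits" list is an assumed proxy for what human editors would add, and trait coverage is a simplified substitute for the similarity measure used in the paper. The sketch only shows the control flow: generate, judge, fold the judge's feedback back into the prompt as in-context guidance, and repeat.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

PREFIX = "- Include: "  # hypothetical format for guidance lines in the prompt


@dataclass
class MetaPrompt:
    """Base task plus judge feedback accumulated as in-context guidance."""
    task: str
    guidance: List[str] = field(default_factory=list)

    def render(self) -> str:
        return "\n".join([self.task] + [PREFIX + g for g in self.guidance])


def toy_generator(prompt: str) -> str:
    """Hypothetical stand-in for the content LLM: echoes each requested trait."""
    traits = [ln[len(PREFIX):] for ln in prompt.splitlines() if ln.startswith(PREFIX)]
    return "Match recap." + "".join(f" [{t}]" for t in traits)


def toy_judge(draft: str, expected: List[str]) -> Tuple[float, List[str]]:
    """Hypothetical stand-in for the judge LLM.

    Returns the fraction of expected traits found in the draft and the
    traits that are still missing (the "unintended" gaps).
    """
    if not expected:
        return 1.0, []
    missing = [t for t in expected if f"[{t}]" not in draft]
    return 1.0 - len(missing) / len(expected), missing


def refine(task: str, expected: List[str], max_iters: int = 10, target: float = 1.0):
    """Iterate generate -> judge -> update prompt until the target is met."""
    prompt = MetaPrompt(task)
    draft, score = "", 0.0
    for iteration in range(1, max_iters + 1):
        draft = toy_generator(prompt.render())
        score, missing = toy_judge(draft, expected)
        if score >= target:
            return draft, score, iteration
        # Feed one missing trait back per round as in-context guidance.
        prompt.guidance.append(missing[0])
    return draft, score, max_iters


if __name__ == "__main__":
    traits = ["rally detail", "player quotes", "score context"]
    print(refine("Write a match recap.", traits))
```

In a real system the two stubs would be replaced by model calls and the expected traits would come from observed human edits; the iteration count returned by `refine` corresponds loosely to the number of judge rounds needed before a draft meets reviewer expectations.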
