The End of Reward Engineering: How LLMs Are Redefining Multi-Agent Coordination

Haoran Su; Yandong Sun; Congjia Yu

arXiv:2601.08237·cs.AI·January 14, 2026

Open Access

TL;DR

This paper discusses how large language models are transforming multi-agent coordination by shifting from manual reward engineering to language-based objectives, addressing key challenges and outlining future research directions.

Contribution

The paper argues for a shift toward language-mediated reward specification and adaptation in multi-agent systems, in place of hand-crafted numerical rewards.

Findings

01. LLMs can synthesize reward functions from natural language descriptions.

02. Language-mediated supervision can serve as a viable alternative to traditional reward engineering.

03. Shared semantic representations can facilitate coordination without explicit rewards.

Abstract

Reward engineering, the manual specification of reward functions to induce desired agent behavior, remains a fundamental challenge in multi-agent reinforcement learning. This difficulty is amplified by credit assignment ambiguity, environmental non-stationarity, and the combinatorial growth of interaction complexity. We argue that recent advances in large language models (LLMs) point toward a shift from hand-crafted numerical rewards to language-based objective specifications. Prior work has shown that LLMs can synthesize reward functions directly from natural language descriptions (e.g., EUREKA) and adapt reward formulations online with minimal human intervention (e.g., CARD). In parallel, the emerging paradigm of Reinforcement Learning from Verifiable Rewards (RLVR) provides empirical evidence that language-mediated supervision can serve as a viable alternative to traditional reward…

Taxonomy

Topics: Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI) · Embodied and Extended Cognition