TL;DR
This paper introduces C3, a method for exact credit assignment in cooperative multi-agent LLM systems, leveraging deterministic interaction histories to enable unbiased evaluation without approximation.
Contribution
It presents a novel, exact, and computationally efficient credit assignment technique for cooperative LLM agents that outperforms existing approximate methods.
Findings
C3 consistently outperforms all baselines across six benchmarks.
Exact credit assignment reduces training token consumption.
Structural properties enable both credit assignment and verification.
Abstract
Removing an agent from a cooperative team to measure its contribution seems natural, yet in multi-agent LLM systems this evaluation distorts the result it claims to measure. This failure is not isolated: learned critics, trajectory-level baselines, and agent-removal counterfactuals all inherit from standard multi-agent reinforcement learning the premise that exact counterfactual evaluation requires privileged environment access, and therefore resort to approximation. In cooperative LLM systems, this premise is false. Interaction histories are deterministic functions of observable text with no hidden state, so any decision point can be restored exactly, making direct causal measurement possible without parametric approximation. C3 exploits this property by fixing the complete history at each decision point, sampling alternative actions under a frozen behavior policy, and computing unbiased…
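The procedure described in the abstract (fix the history, sample alternatives under a frozen policy, compare outcomes) can be illustrated with a minimal Python sketch. The abstract is truncated before it names the estimator, so this is not C3's actual formula: the mean-of-alternatives baseline, the function names `counterfactual_credit`, `toy_return`, and `always_bad`, and the toy reward are all assumptions made for illustration only.

```python
import random
from statistics import mean
from typing import Callable, Tuple

# Assumption: a history is an immutable tuple of text messages, so
# "restoring" a decision point is just reusing the same tuple.
History = Tuple[str, ...]


def counterfactual_credit(
    history: History,
    taken_action: str,
    policy_sample: Callable[[History, random.Random], str],
    rollout_return: Callable[[History], float],
    k: int = 4,
    seed: int = 0,
) -> float:
    """Return of the taken action minus the mean return of k alternatives.

    Hypothetical estimator: the alternatives are sampled from the same
    frozen policy at the exact same history, so only the action differs.
    """
    rng = random.Random(seed)
    alternatives = [policy_sample(history, rng) for _ in range(k)]
    baseline = mean(rollout_return(history + (a,)) for a in alternatives)
    return rollout_return(history + (taken_action,)) - baseline


def toy_return(history: History) -> float:
    """Toy deterministic reward: number of 'good' messages in the history."""
    return float(sum(1 for message in history if message == "good"))


def always_bad(history: History, rng: random.Random) -> str:
    """Toy frozen policy that always proposes the action 'bad'."""
    return "bad"


if __name__ == "__main__":
    start: History = ("good",)
    print(counterfactual_credit(start, "good", always_bad, toy_return))
```

Because the toy history is plain text with no hidden state, every alternative is evaluated from an identical starting point, which is the property the abstract relies on. In a real system `rollout_return` would replay the remaining interaction and score the final outcome.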
