Cooperative Profiles Predict Multi-Agent LLM Team Performance in AI for Science Workflows

Shivani Kumar; Adarsh Bharathwaj; David Jurgens

arXiv:2604.20658·cs.CL·May 8, 2026

TL;DR

This paper demonstrates that cooperative behavior profiles derived from behavioral economics games can predict the performance of multi-agent LLM teams in scientific workflows, offering a diagnostic tool for assessing cooperation.

Contribution

It introduces a benchmarking approach linking game-based cooperative profiles of LLMs to their effectiveness in collaborative scientific tasks.

Findings

01 Game-derived cooperative profiles predict downstream scientific performance.

02 Models investing in team production outperform greedy strategies.

03 Cooperative disposition is a measurable property independent of general ability.

Abstract

Multi-agent systems built from teams of large language models (LLMs) are increasingly deployed for collaborative scientific reasoning and problem-solving. These systems require agents to coordinate under shared constraints, such as GPUs or credit balances, where cooperative behavior matters. Behavioral economics provides a rich toolkit of games that isolate distinct cooperation mechanisms, yet it remains unknown whether a model's behavior in these stylized settings predicts its performance in realistic collaborative tasks. Here, we benchmark 35 open-weight LLMs across six behavioral economics games and show that game-derived cooperative profiles robustly predict downstream performance in AI-for-Science tasks, where teams of LLM agents collaboratively analyze data, build models, and produce scientific reports under shared budget constraints. Models that effectively coordinate games and…
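As a rough illustration of the kind of stylized game the abstract refers to, the sketch below computes payoffs in a linear public goods game, a standard setting in behavioral economics. The abstract does not name the six games used, so the choice of game, the function name `public_goods_payoffs`, and the endowment and multiplier values are all assumptions made for illustration, not the authors' setup.

```python
from typing import List


def public_goods_payoffs(
    contributions: List[float],
    endowment: float = 10.0,   # hypothetical per-agent budget
    multiplier: float = 2.0,   # hypothetical return on the shared pool
) -> List[float]:
    """Return each agent's payoff in a linear public goods game.

    Each agent keeps whatever it does not contribute, and the shared pool
    is multiplied and split equally among all agents.
    """
    n = len(contributions)
    if n == 0:
        raise ValueError("at least one agent is required")
    for c in contributions:
        if not 0 <= c <= endowment:
            raise ValueError("each contribution must lie in [0, endowment]")

    # The multiplied pool is divided equally, regardless of who contributed.
    share = multiplier * sum(contributions) / n
    return [endowment - c + share for c in contributions]


if __name__ == "__main__":
    print(public_goods_payoffs([10, 10, 10, 10]))  # everyone invests in the pool
    print(public_goods_payoffs([0, 0, 0, 0]))      # everyone keeps their budget
    print(public_goods_payoffs([10, 10, 10, 0]))   # one agent free-rides
```

With four agents under these assumed parameters, full contribution yields 20 each versus 10 each when no one contributes, while a lone free rider earns 25 against 15 for the three contributors. That tension between individual and team payoff is the sort of cooperation mechanism such games are designed to isolate.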
