AgentBreeder: Mitigating the AI Safety Risks of Multi-Agent Scaffolds via Self-Improvement

J Rosser; Jakob Foerster

arXiv:2502.00757 · cs.CR · October 15, 2025

TL;DR

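AgentBreeder is a framework for multi-objective evolutionary search over multi-agent LLM scaffolds. In "blue" mode, it raises safety benchmark performance by 79.4% on average while maintaining or improving capability scores. In "red" mode, adversarially weak scaffolds emerge alongside capability optimization, which illustrates the safety risks of multi-agent scaffolding.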

Contribution

The paper introduces AgentBreeder, a framework for multi-objective, self-improving evolutionary search over multi-agent scaffolds, and evaluates the discovered scaffolds on reasoning, mathematics, and safety benchmarks against popular baselines.

Findings

01. 79.4% average uplift in safety benchmark performance in "blue" mode

02. Adversarially weak scaffolds emerge in "red" mode, concurrently with capability optimization

03. In "blue" mode, safety improves while capability scores are maintained or improved

Abstract

Scaffolding Large Language Models (LLMs) into multi-agent systems often improves performance on complex tasks, but the safety impact of such scaffolds has not been thoroughly explored. We introduce AgentBreeder, a framework for multi-objective self-improving evolutionary search over scaffolds. We evaluate discovered scaffolds on widely recognized reasoning, mathematics, and safety benchmarks and compare them with popular baselines. In "blue" mode, we see a 79.4% average uplift in safety benchmark performance while maintaining or improving capability scores. In "red" mode, we find adversarially weak scaffolds emerging concurrently with capability optimization. Our work demonstrates the risks of multi-agent scaffolding and provides a framework for mitigating them. Code is available at https://github.com/jrosseruk/AgentBreeder.
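The multi-objective selection idea in the abstract can be illustrated with a minimal Python sketch. This is not the AgentBreeder implementation: the `Scaffold` class, the scores, and the function names are all hypothetical, and the sketch assumes that "blue" mode maximizes both capability and safety while "red" mode maximizes capability and minimizes safety. It keeps only the Pareto-optimal scaffolds, that is, those not beaten on every objective by another candidate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scaffold:
    name: str          # hypothetical identifier for a candidate scaffold
    capability: float  # benchmark score in [0, 1], higher is better
    safety: float      # benchmark score in [0, 1], higher is better


def objectives(scaffold: Scaffold, mode: str) -> tuple[float, float]:
    """Map a scaffold to a tuple of objectives that are all maximized.

    Assumption (not taken from the source): "blue" rewards safety,
    "red" rewards the lack of it.
    """
    if mode == "blue":
        return (scaffold.capability, scaffold.safety)
    if mode == "red":
        return (scaffold.capability, 1.0 - scaffold.safety)
    raise ValueError(f"unknown mode: {mode!r}")


def dominates(a: tuple[float, ...], b: tuple[float, ...]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(population: list[Scaffold], mode: str = "blue") -> list[Scaffold]:
    """Return the scaffolds that no other scaffold dominates under the given mode."""
    scored = [(s, objectives(s, mode)) for s in population]
    return [
        s for s, obj in scored
        if not any(dominates(other, obj) for _, other in scored)
    ]


# Made-up scores, purely for illustration.
POPULATION = [
    Scaffold("A", capability=0.8, safety=0.3),
    Scaffold("B", capability=0.6, safety=0.9),
    Scaffold("C", capability=0.5, safety=0.5),
    Scaffold("D", capability=0.8, safety=0.2),
]

blue_survivors = [s.name for s in pareto_front(POPULATION, "blue")]  # ["A", "B"]
red_survivors = [s.name for s in pareto_front(POPULATION, "red")]    # ["D"]
```

In an evolutionary loop, the surviving scaffolds would serve as parents for the next generation. How AgentBreeder mutates, scores, and selects scaffolds in practice is defined by the paper and its repository, not by this sketch.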

Code & Models

Repositories

J-Rosser-UK/AgentBreeder (Official)

Taxonomy

Topics: Multi-Agent Systems and Negotiation

Methods: Balanced Selection