The Emergence of Complex Behavior in Large-Scale Ecological Environments

Joseph Bejjani; Chase Van Amburg; Chengrui Wang; Chloe Huangyuan Su; Sarah M. Pratt; Yasin Mazloumi; Naeem Khoshnevis; Sham M. Kakade; Kianté Brantley; Aaron Walsman

arXiv:2510.18221·cs.MA·December 15, 2025

Open Access·3 Reviews

TL;DR

This paper investigates how large-scale ecological environments with many agents give rise to complex behaviors through evolution, using a new multi-agent simulator and modern hardware to analyze the effects of scale and sensing.

Contribution

It introduces a scalable multi-agent simulation framework to study emergent behaviors in large populations without explicit rewards, highlighting the role of environmental scale and sensing.

Findings

01

Emergent behaviors such as resource extraction, foraging, and predation are observed.

02

Larger populations and environments increase behavior stability and diversity.

03

Scaling up enhances the potential for ecologically inspired machine learning methods.

Abstract

We explore how physical scale and population size shape the emergence of complex behaviors in open-ended ecological environments. In our setting, agents are unsupervised and have no explicit rewards or learning objectives but instead evolve over time according to reproduction, mutation, and selection. As they act, agents also shape their environment and the population around them in an ongoing dynamic ecology. Our goal is not to optimize a single high-performance policy, but instead to examine how behaviors emerge and evolve across large populations due to natural competition and environmental pressures. We use modern hardware along with a new multi-agent simulator to scale the environment and population to sizes much larger than previously attempted, reaching populations of over 60,000 agents, each with their own evolved neural network policy. We identify various emergent behaviors…
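To make the reward-free loop described in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' simulator: the names `act`, `mutate`, `tick`, and `run`, the two-weight policy, and every energy constant are assumptions chosen for illustration. It shows only the general idea that no fitness score is computed; agents persist by eating, surviving, and reproducing with mutated copies of their policy weights.

```python
import math
import random


def act(weights, food_visible, rng):
    """Tiny stochastic policy (hypothetical): True if the agent tries to eat."""
    logit = weights[0] * food_visible + weights[1]
    logit = max(-30.0, min(30.0, logit))  # clamp to keep exp() well-behaved
    p_eat = 1.0 / (1.0 + math.exp(-logit))
    return rng.random() < p_eat


def mutate(weights, sigma, rng):
    """Return a new weight list: the parent's weights plus Gaussian noise."""
    return [w + rng.gauss(0.0, sigma) for w in weights]


def tick(population, food, rng, sigma=0.1, upkeep=0.5, meal=2.0, birth_at=6.0):
    """Advance the toy ecology by one step and return the next population.

    Each agent is a dict with 'weights' and 'energy'. Selection is implicit:
    agents that fail to eat run out of energy and are dropped.
    """
    order = list(population)
    rng.shuffle(order)  # no agent gets systematic first access to food
    next_population = []
    for agent in order:
        energy = agent["energy"] - upkeep  # cost of living
        food_visible = 1.0 if food > 0 else 0.0
        if act(agent["weights"], food_visible, rng) and food > 0:
            food -= 1
            energy += meal
        if energy <= 0:
            continue  # starvation removes the agent
        if energy >= birth_at:
            energy /= 2.0  # parent splits its energy with the child
            child = {"weights": mutate(agent["weights"], sigma, rng),
                     "energy": energy}
            next_population.append(child)
        next_population.append({"weights": agent["weights"], "energy": energy})
    return next_population


def run(steps=50, n_agents=20, food_per_step=15, seed=0):
    """Run the toy ecology; food is replenished to a fixed amount each step."""
    rng = random.Random(seed)
    population = [{"weights": [0.0, 0.0], "energy": 3.0}
                  for _ in range(n_agents)]
    sizes = []
    for _ in range(steps):
        population = tick(population, food_per_step, rng)
        sizes.append(len(population))
    return population, sizes
```

Calling `run(steps=50)` returns the final population and the population size at each step. Because food is limited, agents whose weights make them eat more reliably tend to leave more descendants, which is the only form of selection in this sketch.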

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01·Rating 2·Confidence 4

Strengths

The authors developed a new simulation environment to allow rapid evaluation of large grid worlds, where they conduct experiments with populations of more than 60,000 individual agents, each with their own evolved neural network policy. They observe some behaviours they refer to as emergent, that arise more commonly with scale. Specifically these are: the ability of agents to travel long distances inland in order to find resources (and in fact to go back and forth to water, a behaviour they call

Weaknesses

Contextualization with respect to other work is insufficient. In particular, there is no information in the experimental results discussion about how such observations compare to ones in related work. This would be important if this paper is meant as a contribution on the (computational) ecology side, but it's also crucial in order to assess the strength of their simulation, e.g. whether it allows them to remedy specific weaknesses in previous work. The Related Work section near the start of the paper

Reviewer 02·Rating 2·Confidence 4

Strengths

- The paper introduces a JAX-based environment that is easy to scale on a GPU/multi-GPU setup, enabling faster data generation for studying emergence.

- The environment looks like something that can be easily visualized, which is great for developers and researchers in the future.

Weaknesses

- Overall I feel this paper lacks significant novelty. Perhaps one novelty is that this is a large-scale environment with 60,000 agents with some interesting game rules (although they seem similar to Neural MMO, e.g. foraging). Another concern is that the majority of this paper is spent 1. explaining the game and rules followed by 2. analyzing what happens if you evolve all these different (memoryless) agents and what behaviors emerge. There is not much discussion around how these results would

Reviewer 03·Rating 6·Confidence 3

Strengths

1. Fresh perspective on open-ended learning. The paper reframes “intelligence without reward” as a scalable ecological process. The analogy between ecological and model scaling is elegant and offers a new lens on emergence in machine learning.

2. The authors manage to simulate up to 60,000 agents in a large heterogeneous world with realistic resource flow, physics, and reproduction, all efficiently implemented in JAX.

3. Convincing qualitative behaviors. Emergent patterns: migratory resource tr

Weaknesses

1. Lack of quantitative behavioral metrics. Claims about “emergence” are supported mainly by qualitative observation. There are no explicit metrics for behavioral diversity, complexity, or ecological stability.

2. Minimal evolutionary mechanism. The use of pure mutation without crossover or explicit selection limits interpretability of the evolutionary dynamics. It’s unclear whether complexity arises from environment pressure alone or random drift.

3. Missing ablations and baselines. The paper

Taxonomy

Topics: Reinforcement Learning in Robotics · Evolution and Genetic Dynamics · Evolutionary Game Theory and Cooperation