The Emergence of Complex Behavior in Large-Scale Ecological Environments
Joseph Bejjani, Chase Van Amburg, Chengrui Wang, Chloe Huangyuan Su, Sarah M. Pratt, Yasin Mazloumi, Naeem Khoshnevis, Sham M. Kakade, Kianté Brantley, Aaron Walsman

TL;DR
This paper investigates how scaling up ecological environments and their agent populations shapes the emergence of complex behaviors through evolution, using a new multi-agent simulator on modern hardware to analyze the effects of scale and sensing.
Contribution
It introduces a scalable multi-agent simulation framework to study emergent behaviors in large populations without explicit rewards, highlighting the role of environmental scale and sensing.
Findings
Emergent behaviors such as resource extraction, foraging, and predation are observed.
Larger populations and environments increase behavioral stability and diversity.
Scaling up enhances the potential for ecologically inspired machine learning methods.
Abstract
We explore how physical scale and population size shape the emergence of complex behaviors in open-ended ecological environments. In our setting, agents are unsupervised and have no explicit rewards or learning objectives but instead evolve over time according to reproduction, mutation, and selection. As they act, agents also shape their environment and the population around them in an ongoing dynamic ecology. Our goal is not to optimize a single high-performance policy, but instead to examine how behaviors emerge and evolve across large populations due to natural competition and environmental pressures. We use modern hardware along with a new multi-agent simulator to scale the environment and population to sizes much larger than previously attempted, reaching populations of over 60,000 agents, each with their own evolved neural network policy. We identify various emergent behaviors…
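The reward-free loop the abstract describes (agents act, reshape their environment, and evolve through reproduction, mutation, and selection) can be illustrated with a toy sketch. This is not the paper's simulator: the ring-shaped world, the one-parameter `move_prob` policy, and every name and constant below are assumptions chosen for illustration, and plain Python is used instead of JAX to keep the example self-contained.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    pos: int          # cell index on a ring-shaped world (hypothetical layout)
    energy: float     # the agent dies when this drops to zero or below
    move_prob: float  # one-parameter "policy": chance of moving each tick


def step(agents, food, rng, *, regrow=0.1, cost=0.2, bite=1.0,
         birth_energy=4.0, mutation_std=0.05):
    """Advance the toy ecology by one tick. No reward or fitness is computed."""
    n_cells = len(food)
    # Food regrows in every cell, capped at 1.0.
    for i in range(n_cells):
        food[i] = min(1.0, food[i] + regrow)
    rng.shuffle(agents)  # random order so that no agent always eats first
    survivors = []
    for a in agents:
        # Act: the policy decides whether to step to a neighbouring cell.
        if rng.random() < a.move_prob:
            a.pos = (a.pos + rng.choice((-1, 1))) % n_cells
        # Eat: this depletes the cell, so agents shape their environment.
        eaten = min(bite, food[a.pos])
        food[a.pos] -= eaten
        a.energy += eaten - cost
        if a.energy <= 0:
            continue  # death by starvation is the only selection pressure
        if a.energy >= birth_energy:
            # Reproduce: split energy; the child inherits a mutated policy.
            a.energy /= 2
            child_prob = a.move_prob + rng.gauss(0.0, mutation_std)
            child_prob = min(1.0, max(0.0, child_prob))
            survivors.append(Agent(a.pos, a.energy, child_prob))
        survivors.append(a)
    return survivors


def run(n_cells=50, n_agents=20, n_steps=200, seed=0):
    """Run the toy ecology and return the final population and food map."""
    rng = random.Random(seed)
    food = [1.0] * n_cells
    agents = [Agent(rng.randrange(n_cells), 2.0, rng.random())
              for _ in range(n_agents)]
    for _ in range(n_steps):
        agents = step(agents, food, rng)
    return agents, food
```

Calling `run()` returns the surviving agents and the final food map. Nothing in the loop scores or ranks a policy: which `move_prob` values persist depends only on which agents find enough food to survive and reproduce.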
Peer Reviews
Decision · Submitted to ICLR 2026
The authors developed a new simulation environment to allow rapid evaluation of large grid worlds, where they conduct experiments with populations of more than 60,000 individual agents, each with their own evolved neural network policy. They observe some behaviours they refer to as emergent, that arise more commonly with scale. Specifically these are: the ability of agents to travel long distances inland in order to find resources (and in fact to go back and forth to water, a behaviour they call
Contextualization with respect to other work is insufficient. In particular, there is no information in the experimental results discussion about how such observations compare to ones in related work. This would be important if this paper is meant as a contribution on the (computational) ecology side, but it's also crucial in order to assess the strength of their simulation, e.g. it allows them to remedy specific weaknesses in previous work. The Related Work section near the start of the paper
- The paper introduces a JAX-based environment that is easy to scale on a GPU/multi-GPU setup to enable faster data generation for studying emergence.
- The environment looks like something that can be easily visualized, which is great for developers/researchers in the future.
- Overall I feel this paper lacks significant novelty. Perhaps one novelty is that this is a large-scale environment with 60,000 agents with some interesting game rules (although they seem similar to Neural MMO, e.g. foraging). Another concern is that the majority of this paper is spent 1. explaining the game and rules followed by 2. analyzing what happens if you evolve all these different (memoryless) agents and what behaviors emerge. There is not much discussion around how these results would
1. Fresh perspective on open-ended learning. The paper reframes “intelligence without reward” as a scalable ecological process. The analogy between ecological and model scaling is elegant and offers a new lens on emergence in machine learning.
2. The authors manage to simulate up to 60,000 agents in a large heterogeneous world with realistic resource flow, physics, and reproduction, all efficiently implemented in JAX.
3. Convincing qualitative behaviors. Emergent patterns, migratory resource tr
1. Lack of quantitative behavioral metrics. Claims about “emergence” are supported mainly by qualitative observation. There are no explicit metrics for behavioral diversity, complexity, or ecological stability.
2. Minimal evolutionary mechanism. The use of pure mutation without crossover or explicit selection limits interpretability of the evolutionary dynamics. It’s unclear whether complexity arises from environment pressure alone or random drift.
3. Missing ablations and baselines. The paper
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolution and Genetic Dynamics · Evolutionary Game Theory and Cooperation
