Agents of Change: Self-Evolving LLM Agents for Strategic Planning
Nikolas Belle, Dakota Barnes, Alfonso Amayuelas, Ivan Bercovich, Xin Eric Wang, and William Wang

TL;DR
This paper introduces HexMachina, a continual learning system for LLM agents that improves strategic planning in complex, adversarial environments like Settlers of Catan, outperforming traditional prompt-based methods.
Contribution
HexMachina separates environment discovery from strategy improvement, enabling LLMs to evolve stable, high-level strategies through artifact-centric continual learning.
Findings
HexMachina achieves a 54% win rate in Catan experiments.
It outperforms prompt-driven and no-discovery baselines.
Isolating strategy learning from environment discovery improves performance.
Abstract
We address the long-horizon gap in large language model (LLM) agents by enabling them to sustain coherent strategies in adversarial, stochastic environments. Settlers of Catan provides a challenging benchmark: success depends on balancing short- and long-term goals amid randomness, trading, expansion, and blocking. Prompt-centric LLM agents (e.g., ReAct, Reflexion) must re-interpret large, evolving game states each turn, quickly saturating context windows and losing strategic consistency. We propose HexMachina, a continual learning multi-agent system that separates environment discovery (inducing an adapter layer without documentation) from strategy improvement (evolving a compiled player through code refinement and simulation). This design preserves executable artifacts, allowing the LLM to focus on high-level strategy rather than per-turn reasoning. In controlled Catanatron…
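The two-phase split described in the abstract can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in rather than the authors' implementation: `ToyGame` is a one-move toy environment (not the Catanatron API), `discover_adapter` hard-codes what the paper's discovery phase would induce, and a fixed list of candidate players takes the place of LLM-proposed code revisions. The point is only the shape of the loop: the adapter is built once, and strategy improvement then keeps whichever executable player scores best in simulation.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple


class ToyGame:
    """Hypothetical one-move stochastic game; NOT the Catanatron API."""

    def __init__(self, seed: int) -> None:
        self.rng = random.Random(seed)

    def legal_actions(self) -> List[int]:
        return [0, 1, 2]

    def payoff(self, action: int) -> float:
        # Higher-numbered actions pay more on average; noise models dice luck.
        return self.rng.gauss(action, 1.0)


@dataclass
class Adapter:
    """Stable interface that players are written against."""
    list_actions: Callable[[ToyGame], List[int]]
    score: Callable[[ToyGame, int], float]


def discover_adapter() -> Adapter:
    # Phase 1 (assumed): performed once, then frozen as an artifact.
    return Adapter(
        list_actions=lambda game: game.legal_actions(),
        score=lambda game, action: game.payoff(action),
    )


Player = Callable[[Adapter, ToyGame], int]


def make_player(preferred: int) -> Player:
    """Build a fixed-policy player that picks `preferred` when it is legal."""
    def player(adapter: Adapter, game: ToyGame) -> int:
        actions = adapter.list_actions(game)
        return preferred if preferred in actions else actions[0]
    return player


def simulate(player: Player, adapter: Adapter, n_games: int = 200) -> float:
    """Average payoff of `player` over seeded games."""
    total = 0.0
    for seed in range(n_games):
        game = ToyGame(seed)
        total += adapter.score(game, player(adapter, game))
    return total / n_games


def evolve(adapter: Adapter, candidates: List[Player]) -> Tuple[Player, float]:
    # Phase 2 (assumed): evaluate each candidate by simulation, keep the best.
    best, best_score = candidates[0], simulate(candidates[0], adapter)
    for candidate in candidates[1:]:
        candidate_score = simulate(candidate, adapter)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


adapter = discover_adapter()
candidates = [make_player(a) for a in (0, 1, 2)]
best_player, best_score = evolve(adapter, candidates)
```

Because every candidate is evaluated on the same seeded games, score differences reflect the policy rather than luck, which is one simple way to compare players in a stochastic setting.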
Taxonomy
Topics: Reinforcement Learning in Robotics · Multimodal Machine Learning Applications · Topic Modeling
