Mortar: Evolving Mechanics for Automatic Game Design
Muhammad U. Nasir, Yuchen Li, Steven James, Julian Togelius

TL;DR
Mortar is a system that autonomously evolves diverse, playable game mechanics by combining a quality-diversity algorithm with a large language model, and evaluates them through skill-based gameplay testing.
Contribution
It pairs a quality-diversity algorithm with a large language model to generate game mechanics, and scores each mechanic by its contribution to the skill-based ordering of players in complete games composed through tree search.
Findings
Mortar produces diverse, playable games with mechanics that enhance skill-based ordering.
Mechanics evolved by Mortar contribute significantly to maintaining skill progression.
Ablation studies confirm the importance of each system component in Mortar's performance.
Abstract
We present Mortar, a system for autonomously evolving game mechanics for automatic game design. Game mechanics define the rules and interactions that govern gameplay, and designing them manually is a time-consuming and expert-driven process. Mortar combines a quality-diversity algorithm with a large language model to explore a diverse set of mechanics, which are evaluated by synthesising complete games that incorporate both evolved mechanics and those drawn from an archive. The mechanics are evaluated by composing complete games through a tree search procedure, where the resulting games are evaluated by their ability to preserve a skill-based ordering over players -- that is, whether stronger players consistently outperform weaker ones. We assess the mechanics based on their contribution towards the skill-based ordering score in the game. We demonstrate that Mortar produces games that…
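The abstract's evaluation idea, that stronger players should consistently outperform weaker ones and that a mechanic is judged by how much it contributes to that ordering, can be illustrated with a small sketch. This is not the paper's actual metric, which is not given in this summary. It assumes a simple pairwise-concordance score and a leave-one-out contribution, and the names `skill_ordering_score`, `mechanic_contribution`, and `toy_evaluate` are hypothetical.

```python
from itertools import combinations
from typing import Callable, List, Sequence


def skill_ordering_score(returns_by_skill: Sequence[float]) -> float:
    """Fraction of player pairs whose results match the expected skill order.

    ASSUMPTION: returns_by_skill[i] is the average return of the i-th player,
    with players listed from weakest to strongest. Ties count as half a match.
    1.0 means every stronger player beat every weaker one; 0.0 means the
    ordering is fully reversed.
    """
    pairs = list(combinations(range(len(returns_by_skill)), 2))
    if not pairs:
        raise ValueError("need at least two players")
    concordant = 0.0
    for weaker, stronger in pairs:
        if returns_by_skill[stronger] > returns_by_skill[weaker]:
            concordant += 1.0
        elif returns_by_skill[stronger] == returns_by_skill[weaker]:
            concordant += 0.5
    return concordant / len(pairs)


def mechanic_contribution(
    evaluate_game: Callable[[List[str]], Sequence[float]],
    mechanics: List[str],
    target: str,
) -> float:
    """Leave-one-out contribution of `target` to the ordering score.

    ASSUMPTION: evaluate_game is a hypothetical callback that plays a game
    built from the given mechanics and returns per-player average returns,
    ordered from weakest to strongest player.
    """
    with_target = skill_ordering_score(evaluate_game(mechanics))
    reduced = [m for m in mechanics if m != target]
    without_target = skill_ordering_score(evaluate_game(reduced))
    return with_target - without_target


def toy_evaluate(mechanics: List[str]) -> Sequence[float]:
    """Toy stand-in for real playtesting: 'dash' restores a clean ordering."""
    if "dash" in mechanics:
        return [1.0, 2.0, 3.0]
    return [2.0, 1.0, 3.0]


if __name__ == "__main__":
    print(skill_ordering_score([1.0, 2.0, 3.0, 4.0]))
    print(mechanic_contribution(toy_evaluate, ["move", "dash"], "dash"))
```

In this toy setup, a positive contribution means the game orders players by skill more reliably with the mechanic than without it. A real system would replace `toy_evaluate` with agents of differing strength playing the synthesized game.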
Peer Reviews
Decision · Submitted to ICLR 2026
- Interesting and original focus on mechanic generation rather than level or asset generation, which is underexplored in procedural content generation.
- Creative use of LLM-guided mutation and code evolution within a structured search framework.
- Conceptually elegant link between “skill gradient” and perceived game depth.
- The paper is readable and technically ambitious, combining ideas from QD search, MCTS evaluation, and code-based generative design.

- Not sure if it is that relevant for ICLR, which is broadly about learning representations.
- The claim that “a game’s quality is revealed through a consistent skill gradient” seems too strong. Many successful games (e.g., Animal Crossing, Cards Against Humanity, The Sims) are not skill-based yet still compelling.
- The user study (N=10) is rather small.
- It’s unclear how MORTAR compares to simpler or ablated variants without LLM-driven mutations. The relative importance of components could be
The originality of the work primarily comes from generating games with Python code, thus allowing for a larger search space than prior work in the area. This is reasonable originality as Python certainly represents a larger search space than PuzzleScript or Ludii. The quality of the work is more mixed. The system is described well and the implementation follows nicely from prior work, particularly the reliance on Nielsen et al.'s evaluation approach. The novelty of trying to identify what mechan
The current draft of this paper has a number of weaknesses that could be improved. First, the paper makes an odd claim around there being "comparatively little attention" paid to generating mechanics. This does not follow from the literature, or even from the related work section in this paper. More broadly there is an issue in the introduction of motivation. It's not clear why we need MORTAR, given the large amount of existing game generation approaches. Second, and most importantly, the cu
Game generation is a challenging problem but has not gained enough attention. This paper demonstrates an attempt towards generating complete games purely through AI. It generates both game mechanics and level layouts simultaneously. While it does not create assets such as tile images, modern computer vision techniques should make generating such assets relatively straightforward. The methodology of using an LLM as an evolutionary operator is practical. The system is implemented based on Python r
The paper does not make its key contributions and technical content sufficiently clear. I would recommend that the authors explicitly summarize their contributions and novelty beyond closely related papers (e.g., Gavel (Todd et al., 2024)) in the introduction, even though I personally understand such contributions and novelty. It would be appreciated if the authors could provide some visualizations of how the LLM operates on the game in the appendix. Meanwhile, the process of level layout gener
Taxonomy
Topics: Artificial Intelligence in Games · Digital Games and Media · Educational Games and Gamification
