Mortar: Evolving Mechanics for Automatic Game Design

Muhammad U. Nasir; Yuchen Li; Steven James; Julian Togelius

arXiv:2601.00105 · cs.AI · January 5, 2026

Open Access · 3 Reviews

TL;DR

Mortar is a system that autonomously evolves diverse, playable game mechanics by combining a quality-diversity algorithm with a large language model. The mechanics are evaluated through skill-based gameplay testing.
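To make the quality-diversity idea concrete, here is a minimal sketch of an archive-based search loop in the style of MAP-Elites. This is an illustration under stated assumptions, not Mortar's implementation: the summary only says "quality-diversity algorithm", so MAP-Elites is a stand-in, and `mutate_stub`, `descriptor`, and `fitness` are hypothetical placeholders. In particular, `mutate_stub` appends a random character where a real system would ask a language model to rewrite mechanic code.

```python
import random


def mutate_stub(mechanic: str, rng: random.Random) -> str:
    """Stand-in for an LLM mutation call (assumption): append a random token."""
    return mechanic + rng.choice("abc")


def descriptor(mechanic: str) -> int:
    """Hypothetical behaviour descriptor: maps a mechanic to one of 5 archive cells."""
    return len(mechanic) % 5


def fitness(mechanic: str) -> float:
    """Hypothetical quality score; a real system would play-test a game here."""
    return mechanic.count("a") / len(mechanic)


def evolve(seed: str, iterations: int, rng: random.Random) -> dict[int, tuple[float, str]]:
    """Keep the best mechanic found so far in each descriptor cell."""
    archive = {descriptor(seed): (fitness(seed), seed)}
    for _ in range(iterations):
        # Pick a random elite as the parent and mutate it.
        _, parent = rng.choice(list(archive.values()))
        child = mutate_stub(parent, rng)
        cell, score = descriptor(child), fitness(child)
        # Insert the child if its cell is empty or it beats the current elite.
        if cell not in archive or score > archive[cell][0]:
            archive[cell] = (score, child)
    return archive


archive = evolve("a", 200, random.Random(0))
```

The archive keeps one elite per cell, so the search is pushed toward a spread of different mechanics and away from converging on a single high scorer.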

Contribution

It combines a quality-diversity algorithm with a large language model to generate game mechanics, and evaluates them by how well the resulting games preserve a skill-based ordering of players.

Findings

01

Mortar produces diverse, playable games with mechanics that enhance skill-based ordering.

02

Mechanics evolved by Mortar contribute significantly to maintaining skill progression.

03

Ablation studies confirm the importance of each system component in Mortar's performance.

Abstract

We present Mortar, a system for autonomously evolving game mechanics for automatic game design. Game mechanics define the rules and interactions that govern gameplay, and designing them manually is a time-consuming and expert-driven process. Mortar combines a quality-diversity algorithm with a large language model to explore a diverse set of mechanics, which are evaluated by synthesising complete games that incorporate both evolved mechanics and those drawn from an archive. The mechanics are evaluated by composing complete games through a tree search procedure, where the resulting games are evaluated by their ability to preserve a skill-based ordering over players -- that is, whether stronger players consistently outperform weaker ones. We assess the mechanics based on their contribution towards the skill-based ordering score in the game. We demonstrate that Mortar produces games that…
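The skill-based ordering idea in the abstract can be illustrated with a small pairwise-concordance score. This is a sketch under assumptions, not the paper's metric: `skill_ordering_score` is a hypothetical Kendall-tau-style measure over players listed from weakest to strongest, and `mechanic_contribution` is a hypothetical leave-one-out difference between a game with a mechanic and the same game without it.

```python
from itertools import combinations


def skill_ordering_score(returns):
    """Score in [-1, 1] for how well game returns respect player strength.

    returns[i] is the mean return of player i, with players listed from
    weakest to strongest (an assumed input format).
    """
    pairs = list(combinations(range(len(returns)), 2))
    if not pairs:
        raise ValueError("need at least two players")
    # For each pair i < j, player j is the stronger one.
    concordant = sum(returns[j] > returns[i] for i, j in pairs)
    discordant = sum(returns[j] < returns[i] for i, j in pairs)
    return (concordant - discordant) / len(pairs)


def mechanic_contribution(returns_with, returns_without):
    """Hypothetical leave-one-out contribution of a single mechanic."""
    return skill_ordering_score(returns_with) - skill_ordering_score(returns_without)


# Usage: four agents of increasing strength play the same game.
score = skill_ordering_score([0.1, 0.4, 0.5, 0.9])
```

A score near 1 means stronger players reliably outperform weaker ones, a score near 0 means the game does not separate them, and a negative score means the ordering is inverted.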

Peer Reviews

Decision · Submitted to ICLR 2026

Reviewer 01 · Rating 4 · Confidence 4

Strengths

- Interesting and original focus on mechanic generation rather than level or asset generation, which is underexplored in procedural content generation.
- Creative use of LLM-guided mutation and code evolution within a structured search framework.
- Conceptually elegant link between “skill gradient” and perceived game depth.
- The paper is readable and technically ambitious, combining ideas from QD search, MCTS evaluation, and code-based generative design.

Weaknesses

- Not sure if it is that relevant for ICLR, which is broadly about learning representations.
- The claim that “a game’s quality is revealed through a consistent skill gradient” seems too strong. Many successful games (e.g., Animal Crossing, Cards Against Humanity, The Sims) are not skill-based yet still compelling.
- The user study (N=10) is rather small.
- It’s unclear how MORTAR compares to simpler or ablated variants without LLM-driven mutations. The relative importance of components could be

Reviewer 02 · Rating 2 · Confidence 4

Strengths

The originality of the work primarily comes from generating games with Python code, thus allowing for a larger search space than prior work in the area. This is reasonable originality, as Python certainly represents a larger search space than PuzzleScript or Ludii. The quality of the work is more mixed. The system is described well and the implementation follows nicely from prior work, particularly the reliance on Nielsen et al.'s evaluation approach. The novelty of trying to identify what mechan

Weaknesses

The current draft of this paper has a number of weaknesses that could be improved. First, the paper makes an odd claim around there being "comparatively little attention" paid to generating mechanics. This does not follow from the literature, or even from the related work section in this paper. More broadly there is an issue in the introduction of motivation. It's not clear why we need MORTAR, given the large amount of existing game generation approaches. Second, and most importantly, the cu

Reviewer 03 · Rating 6 · Confidence 4

Strengths

Game generation is a challenging problem but has not gained enough attention. This paper demonstrates an attempt towards generating complete games purely through AI. It generates both game mechanics and level layouts simultaneously. While it does not create assets such as tile images, modern computer vision techniques should make generating such assets relatively straightforward. The methodology of using an LLM as an evolutionary operator is practical. The system is implemented based on Python r

Weaknesses

The paper does not make its key contributions and technical content sufficiently clear. I would recommend that the authors explicitly summarize their contributions and novelty beyond closely related papers (e.g., Gavel (Todd et al., 2024)) in the introduction, even though I personally understand such contributions and novelty. It would be appreciated if the authors could provide some visualizations of how the LLM operates on the game in the appendix. Meanwhile, the process of level layout gener

Taxonomy

Topics: Artificial Intelligence in Games · Digital Games and Media · Educational Games and Gamification