Rule Synergy Analysis using LLMs: State of the Art and Implications

Bahar Bateni, Benjamin Pratt, and Jim Whitehead

arXiv:2508.19484·cs.CL·August 28, 2025

TL;DR

This paper evaluates large language models' ability to understand complex rule interactions in dynamic environments like card games, revealing strengths in identifying non-synergies but challenges with positive and negative rule interactions.

Contribution

Introduces a new dataset of card synergies from Slay the Spire and analyzes LLMs' performance in understanding rule interactions in this context.

Findings

01 LLMs excel at identifying non-synergistic pairs.

02 LLMs struggle to detect positive and, particularly, negative synergies.

03 Common errors involve timing, defining game states, and following game rules.

Abstract

Large language models (LLMs) have demonstrated strong performance across a variety of domains, including logical reasoning, mathematics, and more. In this paper, we investigate how well LLMs understand and reason about complex rule interactions in dynamic environments, such as card games. We introduce a dataset of card synergies from the game Slay the Spire, where pairs of cards are classified based on their positive, negative, or neutral interactions. Our evaluation shows that while LLMs excel at identifying non-synergistic pairs, they struggle with detecting positive and, particularly, negative synergies. We categorize common error types, including issues with timing, defining game states, and following game rules. Our findings suggest directions for future research to improve model performance in predicting the effect of rules and their interactions.
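The evaluation described above treats synergy prediction as a three-way classification over card pairs, so per-class recall is one natural way to expose the gap between neutral pairs and positive or negative ones. The following is a minimal sketch of that bookkeeping, not the authors' evaluation code: the label strings, the function name `per_class_recall`, and the toy labels are all assumptions made for illustration and are not taken from the paper's dataset or results.

```python
from collections import Counter

# Hypothetical label names; the paper's dataset may encode classes differently.
LABELS = ("positive", "negative", "neutral")


def per_class_recall(gold, predicted):
    """Return recall for each label that appears in the gold annotations."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    totals = Counter(gold)
    # Count a hit under the gold label whenever the prediction matches it.
    hits = Counter(g for g, p in zip(gold, predicted) if g == p)
    return {label: hits[label] / totals[label] for label in LABELS if totals[label]}


# Toy placeholder labels for four card pairs (illustrative only, not real data).
gold = ["neutral", "neutral", "positive", "negative"]
predicted = ["neutral", "neutral", "positive", "neutral"]
recall = per_class_recall(gold, predicted)
```

Reporting recall per class rather than overall accuracy keeps a large share of neutral pairs from masking weak performance on the rarer synergy classes.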
