Solving Zebra Puzzles Using Constraint-Guided Multi-Agent Systems
Shmuel Berman, Kathleen McKeown, Baishakhi Ray

TL;DR
This paper presents ZPS, a multi-agent system that combines LLMs with a theorem prover to solve complex Zebra puzzles, significantly outperforming standalone LLMs.
Contribution
The paper introduces ZPS, which integrates LLMs with an SMT solver and feedback between agents to improve the solving of complex logic puzzles such as Zebra puzzles.
Findings
GPT-4 achieved a 166% improvement in correct solutions with ZPS
Automated grid puzzle grader proved reliable in a user study
ZPS outperformed individual LLMs in puzzle solving
Abstract
Prior research has enhanced the ability of Large Language Models (LLMs) to solve logic puzzles using techniques such as chain-of-thought prompting or introducing a symbolic representation. These frameworks are still usually insufficient to solve complicated logical problems, such as Zebra puzzles, due to the inherent complexity of translating natural language clues into logical statements. We introduce a multi-agent system, ZPS, that integrates LLMs with an off-the-shelf theorem prover. This system tackles the complex puzzle-solving task by breaking down the problem into smaller, manageable parts, generating SMT (Satisfiability Modulo Theories) code to solve them with a theorem prover, and using feedback between the agents to repeatedly improve their answers. We also introduce an automated grid puzzle grader to assess the correctness of our puzzle solutions and show that the automated…
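To make the idea of translating clues into logical constraints concrete, here is a minimal sketch in Python. It is not the ZPS implementation: it swaps the SMT solver for an exhaustive search over permutations so that it runs with the standard library only, and the three-house puzzle, its clues, and every name in the code (`clues_hold`, `solve`, `as_grid`) are hypothetical and chosen purely for illustration.

```python
from itertools import permutations

# Hypothetical toy puzzle (not taken from the paper): three houses in a row,
# each with one color and one pet.
HOUSES = (1, 2, 3)
COLORS = ("red", "green", "blue")
PETS = ("cat", "dog", "zebra")


def clues_hold(color_at, pet_at):
    """Check the clues; each dict maps an attribute value to a house number."""
    return (
        color_at["red"] + 1 == color_at["green"]  # red is immediately left of green
        and pet_at["dog"] == color_at["blue"]     # the dog lives in the blue house
        and pet_at["zebra"] == 3                  # the zebra lives in house 3
    )


def solve():
    """Enumerate every assignment and keep those that satisfy all clues."""
    solutions = []
    for color_pos in permutations(HOUSES):
        color_at = dict(zip(COLORS, color_pos))
        for pet_pos in permutations(HOUSES):
            pet_at = dict(zip(PETS, pet_pos))
            if clues_hold(color_at, pet_at):
                solutions.append((color_at, pet_at))
    return solutions


def as_grid(color_at, pet_at):
    """Invert the value-to-house maps into a house-to-(color, pet) grid."""
    house_color = {house: color for color, house in color_at.items()}
    house_pet = {house: pet for pet, house in pet_at.items()}
    return {house: (house_color[house], house_pet[house]) for house in HOUSES}


solutions = solve()
grid = as_grid(*solutions[0])
print(len(solutions), grid)
```

Each clue becomes one boolean expression over integer house positions, which is the same shape of constraint an SMT solver would receive. Exhaustive search is only workable for a puzzle this small; larger grids are where a dedicated solver becomes necessary.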
Taxonomy
Topics: Constraint Satisfaction and Optimization · Semantic Web and Ontologies · Rough Sets and Fuzzy Logic
Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Softmax · Residual Connection · Byte Pair Encoding · Layer Normalization · Label Smoothing · Adam · Dropout
