Causal Language Modeling Can Elicit Search and Reasoning Capabilities on Logic Puzzles

Kulin Shah; Nishanth Dikkala; Xin Wang; Rina Panigrahy

arXiv:2409.10502·cs.LG·September 17, 2024

TL;DR

This paper shows that Transformers trained with causal language modeling on solver steps given in logical order can learn to solve Sudoku and Zebra puzzles, which suggests that next-token training can elicit search and reasoning behavior.

Contribution

The paper shows that training Transformers on logically ordered step sequences enables them to solve Sudoku and Zebra puzzles, which it presents as evidence of emergent search and reasoning abilities.

Findings

01

Transformer models solve 94.21% of Sudoku puzzles correctly.

02

Models solve 92.04% of Zebra puzzles accurately.

03

The model's internal representations encode the set of possible values for each cell, which points to implicit reasoning.

Abstract

Causal language modeling using the Transformer architecture has yielded remarkable capabilities in Large Language Models (LLMs) over the last few years. However, the extent to which fundamental search and reasoning capabilities have emerged within LLMs remains a topic of ongoing debate. In this work, we study whether causal language modeling can learn a complex task such as solving Sudoku puzzles. To solve a Sudoku, the model is first required to search over all empty cells of the puzzle to decide on a cell to fill and then apply an appropriate strategy to fill the decided cell. Sometimes, the application of a strategy only results in thinning down the possible values in a cell rather than concluding the exact value of the cell. In such cases, multiple strategies are applied one after the other to fill a single cell. We observe that Transformer models trained on this synthetic task can indeed…
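The search-then-fill loop described in the abstract can be illustrated with a small sketch. This is not the authors' solver or their training-data generator; it is a minimal example that assumes a single elimination strategy (fill a cell once only one candidate value remains) and records the fill steps in the order a solver would make them. The names `candidates` and `solve_in_logical_order`, and the toy puzzle built from a formula, are hypothetical and chosen only for illustration.

```python
from typing import List, Set, Tuple

Grid = List[List[int]]  # 9x9 grid, 0 marks an empty cell


def candidates(grid: Grid, r: int, c: int) -> Set[int]:
    """Values not yet used in the row, column, or 3x3 box of cell (r, c)."""
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return set(range(1, 10)) - used


def solve_in_logical_order(grid: Grid) -> List[Tuple[int, int, int]]:
    """Search all empty cells for one with a single candidate and fill it.

    Returns the fill steps as (row, col, value) in the order they were made.
    Stops when no empty cell can be decided by elimination alone, so harder
    puzzles (which need additional strategies) are left partially filled.
    """
    grid = [row[:] for row in grid]  # work on a copy
    steps: List[Tuple[int, int, int]] = []
    while True:
        progress = False
        for r in range(9):
            for c in range(9):
                if grid[r][c] == 0:
                    cand = candidates(grid, r, c)
                    if len(cand) == 1:
                        value = cand.pop()
                        grid[r][c] = value
                        steps.append((r, c, value))
                        progress = True
        if not progress:
            return steps


# Toy puzzle (assumption): a valid solved grid from a shift pattern,
# with three cells blanked out.
solved = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
puzzle = [row[:] for row in solved]
for r, c in [(0, 0), (4, 4), (8, 8)]:
    puzzle[r][c] = 0

steps = solve_in_logical_order(puzzle)
print(steps)
```

A step list like this, serialized as tokens, is one plausible form for the "logical sequence of steps" a causal language model could be trained on; the exact encoding used in the paper is not specified in this listing.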

Code & Models

Repositories

kulinshah98/llm-reasoning-logic-puzzles
JAX · Official

Videos

Causal language modeling can elicit search and reasoning capabilities on logic puzzles · SlidesLive

Taxonomy

Topics: Logic, Reasoning, and Knowledge · Semantic Web and Ontologies · Bayesian Modeling and Causal Inference

Methods: Attention Is All You Need · Sparse Evolutionary Training · Byte Pair Encoding · Absolute Position Encodings · Softmax · Label Smoothing · Layer Normalization · Dropout · Position-Wise Feed-Forward Layer · Residual Connection