Faith and Fate: Limits of Transformers on Compositionality
Nouha Dziri, Ximing Lu, Melanie Sclar, Xiang Lorraine Li, Liwei Jiang, Bill Yuchen Lin, Peter West, Chandra Bhagavatula, Ronan Le Bras, Jena D. Hwang, Soumya Sanyal, Sean Welleck, Xiang Ren, Allyson Ettinger, Zaid Harchaoui, Yejin Choi

TL;DR
This paper investigates the limitations of transformer large language models in solving complex compositional tasks, revealing they rely on subgraph matching rather than systematic reasoning, with performance degrading as task complexity increases.
Contribution
The study introduces a systematic framework to analyze transformer LLMs on compositional tasks and provides theoretical insights into their reasoning limitations.
Findings
Transformers solve tasks via subgraph matching, not systematic reasoning.
Performance declines rapidly with increased task complexity.
Empirical and theoretical analyses highlight fundamental limitations.
Abstract
Transformer large language models (LLMs) have sparked admiration for their exceptional performance on tasks that demand intricate multi-step reasoning. Yet, these models simultaneously show failures on surprisingly trivial problems. This begs the question: Are these errors incidental, or do they signal more substantial limitations? In an attempt to demystify transformer LLMs, we investigate the limits of these models across three representative compositional tasks -- multi-digit multiplication, logic grid puzzles, and a classic dynamic programming problem. These tasks require breaking problems down into sub-steps and synthesizing these steps into a precise answer. We formulate compositional tasks as computation graphs to systematically quantify the level of complexity, and break down reasoning steps into intermediate sub-procedures. Our empirical findings suggest that transformer LLMs…
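To make the computation-graph framing concrete, here is a minimal Python sketch of how multi-digit multiplication could be laid out as a graph of single-step operations. It is an illustration under stated assumptions, not the authors' construction: the `Node` class, the `build_multiplication_graph` helper, and the use of node count and depth as complexity measures are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One step in a computation graph (hypothetical structure for illustration)."""
    op: str                                   # "input", "mul", or "add"
    value: int                                # result of this step
    parents: list = field(default_factory=list)

    def depth(self) -> int:
        # Inputs sit at depth 0; every other node is one step past its deepest parent.
        if not self.parents:
            return 0
        return 1 + max(p.depth() for p in self.parents)


def build_multiplication_graph(x: int, y: int):
    """Lay out schoolbook multiplication of two positive integers as a graph.

    Assumption: each digit-by-digit product (with its place-value shift) is one
    node, and partial results are accumulated with a chain of addition nodes.
    """
    nodes = []

    def add(op, value, parents=()):
        node = Node(op, value, list(parents))
        nodes.append(node)
        return node

    # Least-significant digit first, so the list index equals the power of ten.
    x_digits = [add("input", int(d)) for d in reversed(str(x))]
    y_digits = [add("input", int(d)) for d in reversed(str(y))]

    total = None
    for j, yd in enumerate(y_digits):
        for i, xd in enumerate(x_digits):
            prod = add("mul", xd.value * yd.value * 10 ** (i + j), [xd, yd])
            if total is None:
                total = prod
            else:
                total = add("add", total.value + prod.value, [total, prod])
    return total, nodes


root, nodes = build_multiplication_graph(12, 34)
print(root.value, len(nodes), root.depth())
```

Under these assumptions, the number of nodes and the depth of the final node both grow with the number of digits, which is one simple way to quantify how much harder a larger instance is.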
Taxonomy
Topics: Topic Modeling · Machine Learning in Materials Science · Natural Language Processing Techniques
