Do Large Language Models Have Compositional Ability? An Investigation into Limitations and Scalability

Zhuoyan Xu; Zhenmei Shi; Yingyu Liang

arXiv:2407.15720 · cs.CL · August 13, 2024 · 1 citation

TL;DR

This paper investigates the compositional ability of large language models. It finds that models perform well on simple composite tasks whose inputs fall into distinct segments, and that scaling improves performance on those tasks, but that models struggle with composite tasks requiring multi-step reasoning.

Contribution

The study provides empirical analysis and theoretical insights into LLMs' compositional capabilities, highlighting their limitations and the impact of model scale on complex task performance.

Findings

01 Models perform well on simple composite tasks with distinct segments

02 Scaling improves performance on simple tasks

03 Models underperform on multi-step reasoning tasks, with scaling offering limited benefits

Abstract

Large language models (LLMs) have emerged as powerful tools for many AI problems and exhibit remarkable in-context learning (ICL) capabilities. Compositional ability, solving unseen complex tasks that combine two or more simple tasks, is an essential reasoning ability for Artificial General Intelligence. Despite the tremendous success of LLMs, how they approach composite tasks, especially those not encountered during the pretraining phase, remains an open and largely underexplored question. In this study, we delve into the ICL capabilities of LLMs on composite tasks, with only simple tasks as in-context examples. We develop a test suite of composite tasks including linguistic and logical challenges and perform empirical studies across different LLM families. We observe that models exhibit divergent behaviors: (1) For simpler composite tasks that apply distinct mapping mechanisms to…
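The setup the abstract describes, in which in-context examples come only from simple tasks while the query requires their composition, can be illustrated with a minimal sketch. The two simple tasks (`to_upper`, `swap_pair`), the `input: ... -> output: ...` prompt format, and the helper names are hypothetical choices made for illustration; they are not taken from the authors' test suite or repository.

```python
from typing import Callable, List

Task = Callable[[str], str]

def to_upper(text: str) -> str:
    """Hypothetical simple task 1: capitalize every character."""
    return text.upper()

def swap_pair(text: str) -> str:
    """Hypothetical simple task 2: swap the two words of a pair."""
    first, second = text.split()
    return f"{second} {first}"

def build_prompt(simple_tasks: List[Task], demo_inputs: List[str], query: str) -> str:
    """Show each simple task in isolation, then ask an unanswered query.

    No demonstration applies more than one task, so the composite
    behavior never appears in the context.
    """
    lines = []
    for task in simple_tasks:
        for x in demo_inputs:
            lines.append(f"input: {x} -> output: {task(x)}")
    lines.append(f"input: {query} -> output:")
    return "\n".join(lines)

def composite_target(simple_tasks: List[Task], query: str) -> str:
    """Reference answer for the composite task: apply every simple task in order."""
    result = query
    for task in simple_tasks:
        result = task(result)
    return result

tasks = [to_upper, swap_pair]
prompt = build_prompt(tasks, ["apple pear", "red blue"], "cat dog")
target = composite_target(tasks, "cat dog")
print(prompt)
print("expected composite answer:", target)
```

A model's completion of `prompt` would then be compared against `target`. How the paper actually formats prompts and scores outputs is not specified in this excerpt.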

Code & Models

Repositories

oliverxuzy/llm_compose
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques