Can LLM find the green circle? Investigation and Human-guided tool manipulation for compositional generalization
Min Zhang, Jianfeng He, Shuo Lei, Murong Yue, Linhang Wang, Chang-Tien Lu

TL;DR
This paper investigates the limitations of large language models in compositional generalization and introduces a human-guided tool manipulation framework that significantly improves performance on challenging benchmarks.
Contribution
The paper proposes a human-guided tool manipulation framework (HTM) that improves LLMs' compositional generalization through better tool creation and integration, outperforming existing methods.
Findings
HTM achieves state-of-the-art results on two benchmarks.
HTM outperforms existing methods by 70% on the most challenging test split.
Prevailing ICL methods struggle with complex compositional questions.
Abstract
The meaning of complex phrases in natural language is composed of their individual components. The task of compositional generalization evaluates a model's ability to understand new combinations of components. Previous studies trained smaller, task-specific models, which exhibited poor generalization. While large language models (LLMs) exhibit impressive generalization abilities on many tasks through in-context learning (ICL), their potential for compositional generalization remains unexplored. In this paper, we first empirically investigate prevailing ICL methods in compositional generalization. We find that they struggle with complex compositional questions due to cumulative errors in long reasoning steps and intricate logic required for tool-making. Consequently, we propose a human-guided tool manipulation framework (HTM) that generates tools for sub-questions and integrates multiple…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
