Investigating Symbolic Capabilities of Large Language Models

Neisarg Dave; Daniel Kifer; C. Lee Giles; Ankur Mali

arXiv:2405.13209 · cs.CL · May 24, 2024

Open Access

TL;DR

This paper evaluates large language models' abilities to perform symbolic reasoning tasks, revealing significant performance declines with increased complexity and highlighting the need for specialized training and model adjustments.

Contribution

It provides a comprehensive evaluation of LLMs on symbolic tasks, an area previously underexplored in LLM research, using Chomsky's Hierarchy as the assessment framework.

Findings

1. Performance declines as symbolic complexity increases.

2. Fine-tuned GPT-3.5 shows limited improvement.

3. Models have limited generalization on symbolic tasks.

Abstract

Prompting techniques have significantly enhanced the capabilities of Large Language Models (LLMs) across various complex tasks, including reasoning, planning, and solving math word problems. However, most research has predominantly focused on language-based reasoning and word problems, often overlooking the potential of LLMs in handling symbol-based calculations and reasoning. This study aims to bridge this gap by rigorously evaluating LLMs on a series of symbolic tasks, such as addition, multiplication, modulus arithmetic, numerical precision, and symbolic counting. Our analysis encompasses eight LLMs, including four enterprise-grade and four open-source models, of which three have been pre-trained on mathematical tasks. The assessment framework is anchored in Chomsky's Hierarchy, providing a robust measure of the computational abilities of these models. The evaluation employs…
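
To make the kind of evaluation described above concrete, here is a minimal sketch of generating symbolic tasks whose difficulty scales with input size and of scoring answers by exact match. It is not the authors' benchmark: the function names (`make_addition_task`, `make_counting_task`, `exact_match_accuracy`), the prompt wording, and the choice of exact-match scoring are all assumptions made for illustration.

```python
import random


def make_addition_task(num_digits, rng):
    """Return a (prompt, answer) pair for adding two num_digits-digit integers.

    Hypothetical helper: the prompt format is an assumption, not the paper's.
    """
    low, high = 10 ** (num_digits - 1), 10 ** num_digits - 1
    a, b = rng.randint(low, high), rng.randint(low, high)
    return f"{a} + {b} =", str(a + b)


def make_counting_task(length, rng, alphabet="ab"):
    """Return a (prompt, answer) pair for counting one symbol in a random sequence."""
    seq = [rng.choice(alphabet) for _ in range(length)]
    target = alphabet[0]
    prompt = f"Count '{target}' in: {' '.join(seq)}"
    return prompt, str(seq.count(target))


def exact_match_accuracy(predictions, answers):
    """Fraction of predictions equal to the reference answer after trimming whitespace."""
    if not answers:
        return 0.0
    correct = sum(p.strip() == a for p, a in zip(predictions, answers))
    return correct / len(answers)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the generated tasks are reproducible
    for digits in (2, 4, 8):  # complexity grows with operand length
        prompt, answer = make_addition_task(digits, rng)
        print(digits, prompt, answer)
```

In a setup like this, a model's responses to the generated prompts would be passed to `exact_match_accuracy` separately for each size, so that any decline in accuracy with growing operand or sequence length becomes visible.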

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques