Reasoning in Large Language Models: A Geometric Perspective

Romain Cosentino; Sarath Shekkizhar

arXiv:2407.02678·cs.AI·July 4, 2024

Open Access

TL;DR

This paper studies the reasoning abilities of large language models (LLMs) from a geometric perspective. It links an LLM's expressive power to the density of its self-attention graphs, which determines the intrinsic dimension of the inputs to the MLP blocks, and supports that link with theoretical analysis, toy examples, and empirical evidence.

Contribution

The paper introduces a geometric framework that connects LLM reasoning to self-attention graph density and to the intrinsic dimension of the MLP-block inputs, offering a new view of model expressiveness.

Findings

1. Higher graph density correlates with increased expressive capacity.
2. Intrinsic input dimension influences the reasoning ability of LLMs.
3. Empirical evidence supports the geometric framework's relevance to recent reasoning improvements.

Abstract

The advancement of large language models (LLMs) for real-world applications hinges critically on enhancing their reasoning capabilities. In this work, we explore the reasoning abilities of large language models (LLMs) through their geometrical understanding. We establish a connection between the expressive power of LLMs and the density of their self-attention graphs. Our analysis demonstrates that the density of these graphs defines the intrinsic dimension of the inputs to the MLP blocks. We demonstrate through theoretical analysis and toy examples that a higher intrinsic dimension implies a greater expressive capacity of the LLM. We further provide empirical evidence linking this geometric framework to recent advancements in methods aimed at enhancing the reasoning capabilities of LLMs.
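To make the notion of self-attention graph density concrete, here is a minimal sketch in plain Python. It rests on assumptions that are not taken from the paper: attention weights are treated as edges of a directed graph over tokens, an edge counts as present when its weight exceeds a hypothetical `threshold`, and density is the fraction of causally allowed edges that are present. The function names (`causal_attention`, `graph_density`) and the threshold value are illustrative only; the paper's exact definition may differ.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(scores):
    """Row-wise softmax over the lower-triangular (causal) part of a score matrix."""
    n = len(scores)
    attn = []
    for i in range(n):
        probs = softmax(scores[i][: i + 1])      # token i attends to tokens 0..i
        attn.append(probs + [0.0] * (n - i - 1))  # masked future positions
    return attn


def graph_density(attn, threshold=0.1):
    """Fraction of causally allowed edges whose weight exceeds `threshold`.

    The threshold is an assumption made for this illustration.
    """
    n = len(attn)
    edges = sum(
        1 for i in range(n) for j in range(i + 1) if attn[i][j] > threshold
    )
    possible = n * (n + 1) // 2
    return edges / possible


n = 4
# Uniform scores: every token spreads attention evenly over its context.
uniform = [[0.0] * n for _ in range(n)]
# Peaked scores: every token attends almost only to itself.
peaked = [[10.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

dense = graph_density(causal_attention(uniform))
sparse = graph_density(causal_attention(peaked))
print(dense, sparse)
```

Under these assumptions, the uniform pattern keeps every allowed edge while the peaked pattern keeps only the self-loops, so the first graph is denser. In the abstract's terms, the denser graph would correspond to a higher intrinsic dimension of the inputs to the MLP block; the sketch does not compute that dimension itself.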

Taxonomy

Topics: Natural Language Processing Techniques · Semantic Web and Ontologies