Reasoning in Large Language Models: A Geometric Perspective
Romain Cosentino, Sarath Shekkizhar

TL;DR
This paper studies the reasoning abilities of large language models (LLMs) from a geometric perspective. It links their expressive power to the density of their self-attention graphs, which determines the intrinsic dimension of the inputs to the MLP blocks, and supports this link with theoretical analysis, toy examples, and empirical evidence.
Contribution
It introduces a geometric framework that connects LLM reasoning to the density of self-attention graphs and to the intrinsic dimension of the inputs to the MLP blocks, offering a new way to reason about model expressiveness.
Findings
The density of the self-attention graph defines the intrinsic dimension of the inputs to the MLP blocks.
A higher intrinsic dimension implies greater expressive capacity of the LLM, which the paper ties to its reasoning ability.
Empirical evidence supports the geometric framework's relevance to recent reasoning improvements.
Abstract
The advancement of large language models (LLMs) for real-world applications hinges critically on enhancing their reasoning capabilities. In this work, we explore the reasoning abilities of LLMs from a geometric perspective. We establish a connection between the expressive power of LLMs and the density of their self-attention graphs. Our analysis demonstrates that the density of these graphs defines the intrinsic dimension of the inputs to the MLP blocks. We demonstrate through theoretical analysis and toy examples that a higher intrinsic dimension implies a greater expressive capacity of the LLM. We further provide empirical evidence linking this geometric framework to recent advancements in methods aimed at enhancing the reasoning capabilities of LLMs.
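To make the notion of self-attention graph density concrete, here is a minimal sketch in Python. It is not the authors' code: the function names, the causal mask, and the threshold used to decide when an attention weight counts as an edge are all assumptions made for illustration, and the paper may define density differently.

```python
import numpy as np

def causal_softmax_attention(q, k):
    """Row-wise softmax of scaled dot products under a causal mask.

    q, k: arrays of shape (tokens, head_dim). Hypothetical helper,
    standing in for one attention head of a decoder-only model.
    """
    t, d = q.shape
    scores = q @ k.T / np.sqrt(d)
    mask = np.tril(np.ones((t, t), dtype=bool))
    scores = np.where(mask, scores, -np.inf)
    # Subtract the row maximum for numerical stability.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)

def attention_graph_density(attn, threshold=0.1):
    """Fraction of allowed (causal) token pairs whose weight exceeds threshold.

    Assumption: an edge exists when the attention weight is above
    `threshold`; the threshold value is illustrative only.
    """
    t = attn.shape[0]
    mask = np.tril(np.ones((t, t), dtype=bool))
    edges = (attn > threshold) & mask
    return edges.sum() / mask.sum()

# Uniform attention: every token attends equally to all earlier tokens.
uniform = causal_softmax_attention(np.zeros((4, 8)), np.zeros((4, 8)))
# Degenerate attention: every token attends only to itself.
self_only = np.eye(4)

dense = attention_graph_density(uniform)
sparse = attention_graph_density(self_only)
```

Under these assumptions, the uniform head yields a fully connected causal graph, while the self-only head keeps just the diagonal edges. In the abstract's terms, the first case would feed the MLP block inputs of higher intrinsic dimension than the second.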
Taxonomy
Topics: Natural Language Processing Techniques · Semantic Web and Ontologies
