Revealing the structure of language model capabilities

Ryan Burnell, Han Hao, Andrew R. A. Conway, and Jose Hernandez Orallo

arXiv:2306.10062 · cs.CL · June 21, 2023 · 6 cites

TL;DR

This paper finds that the capabilities of large language models are better explained by three factors (reasoning, comprehension, and core language modeling) than by a single monolithic ability, and that these factors relate differently to model properties such as size and tuning.

Contribution

It applies a combination of Bayesian and frequentist factor analysis to performance data from 29 LLMs on 27 cognitive tasks in order to identify and characterize the latent capabilities that structure LLM performance.

Findings

01

LLM capabilities are best explained by three factors.

02

These factors account for most performance variance.

03

Different abilities relate differently to model size and tuning.

Abstract

Building a theoretical understanding of the capabilities of large language models (LLMs) is vital for our ability to predict and explain the behavior of these systems. Here, we investigate the structure of LLM capabilities by extracting latent capabilities from patterns of individual differences across a varied population of LLMs. Using a combination of Bayesian and frequentist factor analysis, we analyzed data from 29 different LLMs across 27 cognitive tasks. We found evidence that LLM capabilities are not monolithic. Instead, they are better explained by three well-delineated factors that represent reasoning, comprehension and core language modeling. Moreover, we found that these three factors can explain a high proportion of the variance in model performance. These results reveal a consistent structure in the capabilities of different LLMs and demonstrate the multifaceted nature of…
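
As a rough illustration of the kind of analysis the abstract describes, the sketch below builds a synthetic 29 × 27 score matrix (models × tasks) driven by three latent factors and checks how much variance three extracted components explain. It is not the authors' method or data: it uses a principal-component extraction from the task correlation matrix as a simple stand-in for the Bayesian and frequentist factor analysis used in the paper, and every name and number other than the 29, 27, and 3 (for example `scores`, `true_loadings`, the 0.9 loading, and the noise level) is an assumption chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sizes taken from the abstract; everything else below is synthetic.
n_models, n_tasks, n_factors = 29, 27, 3

# Hypothetical ground truth: each task loads on exactly one latent factor.
latent = rng.normal(size=(n_models, n_factors))
true_loadings = np.zeros((n_tasks, n_factors))
for task in range(n_tasks):
    true_loadings[task, task % n_factors] = 0.9

# Observed scores = latent abilities projected onto tasks, plus noise.
noise = rng.normal(scale=0.3, size=(n_models, n_tasks))
scores = latent @ true_loadings.T + noise

# Correlate tasks across the population of models (27 x 27 matrix).
corr = np.corrcoef(scores, rowvar=False)

# Eigendecomposition of the symmetric correlation matrix, largest first.
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Unrotated component loadings for the first three components.
loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])

# Share of total variance captured by those three components.
explained = eigvals[:n_factors].sum() / eigvals.sum()
print(f"variance explained by {n_factors} components: {explained:.2f}")
```

A real analysis would also compare models with different numbers of factors and apply a rotation before interpreting the loadings; this sketch only shows the basic step of extracting shared dimensions from individual differences across models.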

Code & Models

Repositories

ryanburnell/revealing-llm-capabilities
Official

Taxonomy

Topics: Topic Modeling · Machine Learning in Materials Science · Natural Language Processing Techniques