LLMs as a synthesis between symbolic and distributed approaches to language
Gemma Boleda

TL;DR
This paper argues that large language models (LLMs) synthesize symbolic and distributed approaches to language, encoding both types of representations and behaviors, which explains their success and offers new insights into language cognition.
Contribution
It argues that deep learning models for language integrate symbolic and distributed representations, bridging two historically opposed approaches.
Findings
LLMs encode morphosyntactic knowledge in a near-discrete fashion.
Models flexibly switch between symbolic and distributed modes.
This synthesis may explain the success of LLMs in language tasks.
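As a toy illustration of the first two findings (not taken from the paper), the sketch below shows how one and the same continuous mechanism, a softmax over logits, can behave in a graded, distributed-like way or in a near-discrete, categorical-like way depending only on the scale of its inputs. The function names, the example logits, and the use of a `temperature` parameter as a stand-in for logit scale are all assumptions made for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with a temperature (hypothetical knob standing in for logit scale)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy_bits(probs):
    """Shannon entropy in bits; low entropy means a near-categorical choice."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Illustrative logits over three hypothetical categories (e.g. feature values).
logits = [2.0, 1.0, 0.5]

graded = softmax(logits, temperature=1.0)          # mass spread over options
near_discrete = softmax(logits, temperature=0.05)  # almost one-hot

print("graded:       ", [round(p, 3) for p in graded], entropy_bits(graded))
print("near-discrete:", [round(p, 3) for p in near_discrete], entropy_bits(near_discrete))
```

The point of the sketch is only that a continuous architecture does not rule out categorical-like behavior: the same function yields either regime, and which one a trained model uses is an empirical question.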
Abstract
Since the middle of the 20th century, a fierce battle has been fought between symbolic and distributed approaches to language and cognition. The success of deep learning models, and LLMs in particular, has been alternately taken as showing that the distributed camp has won, or dismissed as an irrelevant engineering development. In this position paper, I argue that deep learning models for language actually represent a synthesis between the two traditions. This is because 1) deep learning architectures allow for both distributed/continuous/fuzzy and symbolic/discrete/categorical-like representations and processing; 2) models trained on language make use of this flexibility. In particular, I review recent research in interpretability that showcases how a substantial part of morphosyntactic knowledge is encoded in a near-discrete fashion in LLMs. This line of research suggests that…
Taxonomy
Topics: Natural Language Processing Techniques
Methods: Feature Information Entropy Regularized Cross Entropy
