AILA--First Experiments with Localist Language Models
Joachim Diederich

TL;DR
This paper introduces a controllable localist transformer language model that allows dynamic adjustment of representation locality, balancing interpretability and performance without retraining, demonstrated through experiments on WikiText.
Contribution
It presents the first empirical framework for tunable locality in transformer language models, enabling explicit control over interpretability and efficiency tradeoffs.
Findings
Localist configurations reduce attention entropy significantly.
Intermediate locality values optimize interpretability and performance.
Localist models maintain high accuracy with increased interpretability.
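As a rough illustration of the attention entropy metric named in the findings (reported in bits in the abstract), the sketch below computes the Shannon entropy of an attention distribution. The function names are hypothetical, and this summary does not give the paper's exact definition, for example how entropy is averaged over heads, layers, or positions, so treat this as an assumption-laden sketch rather than the author's procedure.

```python
import math


def attention_entropy_bits(weights):
    """Shannon entropy, in bits, of a single attention distribution.

    `weights` is one row of attention weights (one query over all keys).
    The row is renormalized defensively in case it does not sum to 1.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("attention weights must sum to a positive value")
    probs = [w / total for w in weights]
    # Terms with p == 0 contribute nothing (limit of p * log p as p -> 0).
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mean_attention_entropy(rows):
    """Average entropy over several attention rows (assumed aggregation)."""
    return sum(attention_entropy_bits(r) for r in rows) / len(rows)


# Toy rows, not data from the paper.
uniform = [0.25, 0.25, 0.25, 0.25]  # attention spread evenly over 4 keys
peaked = [1.0, 0.0, 0.0, 0.0]       # attention focused on a single key

print(attention_entropy_bits(uniform))
print(attention_entropy_bits(peaked))
print(mean_attention_entropy([uniform, peaked]))
```

Lower values mean attention is concentrated on fewer positions, which is the sense in which a localist configuration would be expected to score lower than a distributed one.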
Abstract
This paper presents the first empirical demonstration of controllable locality in transformer language models, a novel architectural framework that enables continuous control over the degree of representation localization through a tunable locality dial parameter. Unlike traditional language models that rely exclusively on distributed representations, our approach allows dynamic interpolation between highly interpretable localist encodings and efficient distributed representations without requiring model retraining. We conducted experiments on the WikiText corpus using a two-layer transformer architecture, systematically varying the locality parameter λ across the full spectrum from 1.0 (fully localist) to 0.0 (fully distributed). Our results demonstrate that localist configurations achieve dramatically lower attention entropy, with λ = 1.0 yielding 5.36 bits compared to…
Taxonomy
Topics: Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI) · Topic Modeling
