Localist LLMs with Recruitment Learning
Joachim Diederich

TL;DR
This paper introduces a flexible training framework for large language models that allows dynamic adjustment between interpretable localist and efficient distributed representations, with adaptive capacity allocation and theoretical guarantees.
Contribution
The paper proposes a novel, tunable framework with a locality dial and recruitment mechanism for adaptive, multi-granularity capacity allocation in LLMs, supported by rigorous theoretical analysis.
Findings
The framework enables continuous interpolation between localist and distributed representations.
The recruitment mechanism adaptively allocates semantic blocks without full domain knowledge.
Theoretical results establish conditions for attention focus and convergence guarantees.
Abstract
We present a novel framework for training large language models with continuously adjustable internal representations that span the full spectrum from localist (interpretable, rule-based) to distributed (generalizable, efficient) encodings. The key innovations are (1) a locality dial, a tunable parameter that dynamically controls the degree of localization during both training and inference without requiring model retraining, (2) an information-theoretic recruitment mechanism that adaptively allocates semantic blocks as needed, eliminating the requirement for complete domain knowledge at initialization, and (3) a hierarchical recruitment framework that extends capacity allocation to entire specialized LLMs, enabling multi-granularity architectural adaptation. This is achieved through group sparsity penalties on attention mechanisms, information-theoretic anchor design, dynamic rule…
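To make the idea of a group sparsity penalty scaled by a locality dial more concrete, here is a minimal sketch in plain Python. It is not the paper's formulation, which this summary does not give. It assumes a standard group-lasso penalty over a square weight matrix whose rows and columns are partitioned into semantic blocks. The names `group_sparsity_penalty`, `block_ids`, and `dial` are hypothetical and chosen only for illustration.

```python
import math

def group_sparsity_penalty(weights, block_ids, dial):
    """Illustrative group-lasso penalty (an assumption, not the paper's exact loss).

    weights   : square matrix as a list of lists
    block_ids : block label for each row/column index
    dial      : non-negative scale standing in for the locality dial
    Returns dial * sum over (row-block, column-block) pairs of the
    Frobenius norm of the corresponding sub-matrix.
    """
    blocks = sorted(set(block_ids))
    total = 0.0
    for r in blocks:
        for c in blocks:
            # Sum of squared entries in the (r, c) block pair.
            sq = sum(
                weights[i][j] ** 2
                for i, bi in enumerate(block_ids) if bi == r
                for j, bj in enumerate(block_ids) if bj == c
            )
            total += math.sqrt(sq)
    return dial * total

# Hypothetical 4x4 example with two blocks of two dimensions each.
ids = [0, 0, 1, 1]
dense = [[1.0] * 4 for _ in range(4)]
block_diag = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
print(group_sparsity_penalty(dense, ids, 0.5))       # 4.0
print(group_sparsity_penalty(block_diag, ids, 0.5))  # 2.0
```

In this sketch, a block-diagonal matrix incurs a smaller penalty than a dense one, so raising `dial` pushes toward block-local structure, while `dial = 0` removes the pressure entirely and leaves the representation free to be distributed.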
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning
