Discovering Decoupled Functional Modules in Large Language Models
Yanke Yu, Jin Li, Ying Sun, Ping Li, Zhefeng Wang, Yi Zheng

TL;DR
This paper introduces a novel unsupervised framework for discovering and interpreting functional modules within large language models, enhancing understanding of their internal organization and semantic specialization.
Contribution
The paper proposes the ULCMOD framework, which pairs a new objective function with the Iterative Decoupling (IterD) algorithm to disentangle an LLM's neurons into meaningful modules, advancing the interpretability of LLMs.
Findings
The method discovers high-quality, disentangled modules that capture semantic information
The modules show semantic coherence and interpretability
The modules achieve superior performance in various downstream tasks
Abstract
Understanding the internal functional organization of Large Language Models (LLMs) is crucial for improving their trustworthiness and performance. However, how LLMs organize different functions into modules remains highly unexplored. To bridge this gap, we formulate a functional module discovery problem and propose an Unsupervised LLM Cross-layer MOdule Discovery (ULCMOD) framework that simultaneously disentangles the large set of neurons in the entire LLM into modules while discovering the topics of input samples related to these modules. Our framework introduces a novel objective function and an efficient Iterative Decoupling (IterD) algorithm. Extensive experiments show that our method discovers high-quality, disentangled modules that capture more meaningful semantic information and achieve superior performance in various downstream tasks. Moreover, our qualitative analysis reveals…
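To make the problem formulation concrete, the following is a minimal sketch of jointly grouping neurons into modules and input samples into topics. It is a generic alternating co-clustering heuristic and not the paper's objective function or its IterD algorithm, neither of which is specified in the abstract. The function name `co_cluster`, the activation matrix `acts`, and the rule of assigning by highest mean activation are all assumptions made for illustration.

```python
import numpy as np


def co_cluster(acts, k, n_iter=20, seed=0):
    """Alternately assign samples to topics and neurons to modules.

    Hypothetical illustration only, not the ULCMOD objective or IterD.
    acts: (n_samples, n_neurons) array of non-negative activation magnitudes.
    Returns (sample_topics, neuron_modules), integer labels in [0, k).
    """
    rng = np.random.default_rng(seed)
    n_samples, n_neurons = acts.shape
    neuron_modules = rng.integers(0, k, size=n_neurons)
    sample_topics = np.zeros(n_samples, dtype=int)
    for _ in range(n_iter):
        # One-hot membership of neurons in modules, shape (n_neurons, k).
        m = np.eye(k)[neuron_modules]
        # Mean activation of each sample over each module's neurons;
        # the max(..., 1) guard avoids dividing by zero for empty modules.
        per_module = acts @ m / np.maximum(m.sum(axis=0), 1)
        sample_topics = per_module.argmax(axis=1)

        # One-hot membership of samples in topics, shape (n_samples, k).
        t = np.eye(k)[sample_topics]
        # Mean activation of each neuron over each topic's samples.
        per_topic = acts.T @ t / np.maximum(t.sum(axis=0), 1)
        new_modules = per_topic.argmax(axis=1)

        # Stop once the neuron assignment no longer changes.
        if np.array_equal(new_modules, neuron_modules):
            break
        neuron_modules = new_modules
    return sample_topics, neuron_modules
```

Under these assumptions, calling `co_cluster(acts, k=3)` on a matrix of per-sample neuron activations returns one topic label per sample and one module label per neuron. A heuristic this simple can leave some modules empty or stall in a poor local optimum, which is part of why a dedicated objective and algorithm are of interest.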
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Advanced Graph Neural Networks
