Know-MRI: A Knowledge Mechanisms Revealer&Interpreter for Large Language Models
Jiaxiang Liu, Boxuan Xing, Chenhao Yuan, Chenxiang Zhang, Di Wu, Xiusheng Huang, Haida Yu, Chuhan Lang, Pengfei Cao, Jun Zhao, Kang Liu

TL;DR
Know-MRI is an open-source framework that systematically analyzes and interprets the internal knowledge mechanisms of large language models, supporting multiple interpretation methods and input formats for comprehensive diagnostics.
Contribution
It introduces an extensible core module that automatically matches inputs with interpretation methods, enabling flexible and comprehensive analysis of LLMs' knowledge mechanisms.
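The matching idea can be illustrated with a minimal sketch. It is not Know-MRI's actual API: the names `Method`, `REGISTRY`, `register`, `match_methods`, and `interpret`, the field names `prompt`, `subject`, and `target`, and both toy "interpretation methods" are hypothetical placeholders. The sketch assumes each method declares the input fields it needs, and that a method applies to a sample whenever all of those fields are present.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, FrozenSet, Iterable, List

Sample = Dict[str, Any]


@dataclass(frozen=True)
class Method:
    """A hypothetical interpretation method and the input fields it needs."""
    name: str
    required_fields: FrozenSet[str]
    run: Callable[[Sample], Any]


# Hypothetical global registry; new methods extend it without touching the core.
REGISTRY: List[Method] = []


def register(name: str, required_fields: Iterable[str]):
    """Decorator that adds a function to the registry as a Method."""
    def decorator(fn: Callable[[Sample], Any]) -> Callable[[Sample], Any]:
        REGISTRY.append(Method(name, frozenset(required_fields), fn))
        return fn
    return decorator


@register("token_attribution", {"prompt"})
def token_attribution(sample: Sample) -> list:
    # Placeholder logic: assign a uniform score to each whitespace token.
    tokens = sample["prompt"].split()
    if not tokens:
        return []
    return [(tok, 1.0 / len(tokens)) for tok in tokens]


@register("fact_probe", {"prompt", "subject", "target"})
def fact_probe(sample: Sample) -> dict:
    # Placeholder logic: only checks that the subject appears in the prompt.
    return {
        "subject_in_prompt": sample["subject"] in sample["prompt"],
        "target": sample["target"],
    }


def match_methods(sample: Sample) -> List[Method]:
    """Return every registered method whose required fields are all present."""
    fields = set(sample)
    return [m for m in REGISTRY if m.required_fields <= fields]


def interpret(sample: Sample) -> Dict[str, Any]:
    """Run all matching methods and consolidate outputs under method names."""
    return {m.name: m.run(sample) for m in match_methods(sample)}
```

Under these assumptions, a sample containing only `prompt` would be routed to `token_attribution` alone, while a sample that also carries `subject` and `target` would be routed to both methods, with the results collected in a single dictionary keyed by method name.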
Findings
Supports multiple interpretation methods and input formats
Facilitates comprehensive diagnosis of LLMs' internal knowledge
Open-source implementation available for community use
Abstract
As large language models (LLMs) continue to advance, there is a growing urgency to enhance the interpretability of their internal knowledge mechanisms. Consequently, many interpretation methods have emerged, aiming to unravel the knowledge mechanisms of LLMs from various perspectives. However, current interpretation methods differ in input data formats and interpreting outputs. The tools integrating these methods are only capable of supporting tasks with specific inputs, significantly constraining their practical applications. To address these challenges, we present an open-source Knowledge Mechanisms Revealer&Interpreter (Know-MRI) designed to analyze the knowledge mechanisms within LLMs systematically. Specifically, we have developed an extensible core module that can automatically match different input data with interpretation methods and consolidate the interpreting outputs. It…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Healthcare and Education · Topic Modeling
