DeepDecipher: Accessing and Investigating Neuron Activation in Large Language Models

Albert Garde; Esben Kran; Fazl Barez

arXiv:2310.01870 · cs.LG · November 30, 2023 · 2 cites

TL;DR

DeepDecipher is a user-friendly API and interface that facilitates the analysis and interpretation of neuron activations in large language models, enhancing transparency and understanding of model internals.

Contribution

This work introduces DeepDecipher, a novel tool that makes advanced interpretability techniques accessible and scalable for analyzing transformer-based LLMs.

Findings

01 Enables efficient neuron analysis in large models

02 Allows comparison of different models' internal behaviors

03 Improves transparency and trustworthiness of LLMs

Abstract

As large language models (LLMs) become more capable, there is an urgent need for interpretable and transparent tools. Current methods are difficult to implement, and accessible tools to analyze model internals are lacking. To bridge this gap, we present DeepDecipher, an API and interface for probing neurons in transformer models' MLP layers. DeepDecipher makes the outputs of advanced interpretability techniques for LLMs readily available. The easy-to-use interface also makes inspecting these complex models more intuitive. This paper outlines DeepDecipher's design and capabilities. We demonstrate how to analyze neurons, compare models, and gain insights into model behavior. For example, we contrast DeepDecipher's functionality with similar tools like Neuroscope and OpenAI's Neuron Explainer. DeepDecipher enables efficient, scalable analysis of LLMs. By granting access to…

Code & Models

Repositories

apartresearch/deepdecipher (Official)

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Machine Learning in Materials Science · Ferroelectric and Negative Capacitance Devices