diagNNose: A Library for Neural Activation Analysis

Jaap Jumelet

arXiv:2011.06819 · cs.CL · November 16, 2020

TL;DR

diagNNose is an open-source library that offers various interpretability tools for analyzing neural network activations, demonstrated through a case study on language models' subject-verb agreement.

Contribution

The paper introduces diagNNose, an open-source library that gathers a wide array of interpretability techniques for analysing the activations of deep neural networks.

Findings

01 Effective analysis of language model activations

02 Insights into subject-verb agreement mechanisms

03 Open-source availability for community use

Abstract

In this paper we introduce diagNNose, an open source library for analysing the activations of deep neural networks. diagNNose contains a wide array of interpretability techniques that provide fundamental insights into the inner workings of neural networks. We demonstrate the functionality of diagNNose with a case study on subject-verb agreement within language models. diagNNose is available at https://github.com/i-machine-think/diagnnose.

Code & Models

Repositories

i-machine-think/diagnnose
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)

Methods: Interpretability