Identification of Knowledge Neurons in Protein Language Models
Divya Nori, Shivali Singireddy, Marina Ten Have

TL;DR
This paper improves the interpretability of protein language models by identifying and analyzing knowledge neurons, revealing their role in understanding enzyme sequence motifs and enhancing model transparency.
Contribution
The paper introduces methods for identifying knowledge neurons in protein language models, shows that these neurons capture enzyme sequence motifs, and demonstrates that the selection methods outperform a random baseline.
Findings
Knowledge neurons are densely located in key vector prediction networks.
Activation-based and integrated-gradient-based selection methods outperform a random baseline.
Knowledge neurons capture enzyme sequence motif information.
Abstract
Neural language models have become powerful tools for learning complex representations of entities in natural language processing tasks. However, their interpretability remains a significant challenge, particularly in domains like computational biology where trust in model predictions is crucial. In this work, we aim to enhance the interpretability of protein language models, specifically the state-of-the-art ESM model, by identifying and characterizing knowledge neurons - components that express understanding of key information. After fine-tuning the ESM model for the task of enzyme sequence classification, we compare two knowledge neuron selection methods that preserve a subset of neurons from the original model. The two methods, activation-based and integrated gradient-based selection, consistently outperform a random baseline. In particular, these methods show that there is a high…
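To make the two selection strategies named in the abstract more concrete, the following is a minimal sketch on a toy model, not the authors' implementation. It assumes a single layer of hidden activations feeding a logistic readout, and it uses the standard integrated-gradients formulation with an all-zero baseline. The function names (`activation_scores`, `integrated_gradients`, `select_top_k`, `select_random`, `keep_only`), the toy data, and the choice of mean absolute value as the aggregation are all illustrative assumptions; the actual work applies selection to a fine-tuned ESM model.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def activation_scores(hidden):
    # Activation-based score: mean absolute activation of each neuron over the batch.
    return np.abs(hidden).mean(axis=0)


def integrated_gradients(hidden, w_out, steps=50):
    # Per-example attribution of p = sigmoid(hidden @ w_out) to each neuron,
    # integrating the gradient along the straight path from 0 to `hidden`.
    alphas = (np.arange(steps) + 0.5) / steps  # midpoint Riemann sum
    grad_sum = np.zeros_like(hidden, dtype=float)
    for alpha in alphas:
        p = sigmoid((alpha * hidden) @ w_out)         # shape (n_examples,)
        grad_sum += (p * (1.0 - p))[:, None] * w_out  # dp/dhidden at this point
    return hidden * grad_sum / steps


def gradient_scores(hidden, w_out, steps=50):
    # Gradient-based score: mean absolute attribution of each neuron over the batch.
    return np.abs(integrated_gradients(hidden, w_out, steps)).mean(axis=0)


def select_top_k(scores, k):
    # Indices of the k highest-scoring neurons.
    return np.argsort(scores)[::-1][:k]


def select_random(n_neurons, k, rng):
    # Random baseline: k distinct neuron indices chosen uniformly.
    return rng.choice(n_neurons, size=k, replace=False)


def keep_only(hidden, keep):
    # Preserve the selected neurons and zero out all others.
    masked = np.zeros_like(hidden)
    masked[:, keep] = hidden[:, keep]
    return masked


# Toy data (hypothetical): 32 examples, 16 ReLU-like hidden neurons.
rng = np.random.default_rng(0)
hidden = np.maximum(rng.normal(size=(32, 16)), 0.0)
w_out = rng.normal(size=16)
k = 4

selections = {
    "activation": select_top_k(activation_scores(hidden), k),
    "integrated gradients": select_top_k(gradient_scores(hidden, w_out), k),
    "random": select_random(hidden.shape[1], k, rng),
}
p_full = sigmoid(hidden @ w_out)
for name, keep in selections.items():
    p_kept = sigmoid(keep_only(hidden, keep) @ w_out)
    print(f"{name}: mean |p_full - p_kept| = {np.abs(p_full - p_kept).mean():.4f}")
```

The loop at the end compares how closely each pruned toy model tracks the full model's predictions. In the setting the abstract describes, the analogous comparison would be classification performance after preserving only the selected neurons.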
Taxonomy
Topics: Machine Learning in Bioinformatics · Machine Learning in Materials Science · Topic Modeling
