Automated Neuron Labelling Enables Generative Steering and Interpretability in Protein Language Models
Arjun Banerjee, David Martinez, Camille Dang, Ethan Tam

TL;DR
This paper introduces an automated neuron labeling framework for protein language models. The labels support interpretability and neuron activation-guided protein generation, and they expose how model structure relates to biological properties.
Contribution
The authors present what they describe as the first method that scales to labeling every neuron in a protein language model (PLM) with a biologically grounded natural language description, and they develop a neuron activation-guided steering technique for protein design.
Findings
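Individual neurons are selectively sensitive to diverse biochemical and structural properties.
The steering method generates proteins that converge to target traits, including molecular weight, instability index, alpha helices, and canonical zinc finger motifs.
Analysis of labeled neurons across model sizes reveals scaling laws and a structured neuron space in PLMs.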
Abstract
Protein language models (PLMs) encode rich biological information, yet their internal neuron representations are poorly understood. We introduce the first automated framework for labeling every neuron in a PLM with biologically grounded natural language descriptions. Unlike prior approaches relying on sparse autoencoders or manual annotation, our method scales to hundreds of thousands of neurons, revealing that individual neurons are selectively sensitive to diverse biochemical and structural properties. We then develop a novel neuron activation-guided steering method to generate proteins with desired traits, enabling convergence to target biochemical properties like molecular weight and instability index as well as secondary and tertiary structural motifs, including alpha helices and canonical zinc fingers. We finally show that analysis of labeled neurons in different model sizes reveals…
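The abstract names molecular weight as one of the target properties for steering. The sketch below is not the authors' method. It only illustrates how convergence toward a target molecular weight could be scored for candidate sequences. The function names, the candidate sequences, and the 1000 Da target are hypothetical; the residue masses are standard average values in daltons.

```python
# Illustrative only: scores how close a sequence is to a target molecular
# weight. This is an assumed evaluation helper, not the paper's steering code.

# Average residue masses in daltons (free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added back for the chain termini


def molecular_weight(sequence: str) -> float:
    """Average molecular weight of a linear peptide, in daltons."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    unknown = set(seq) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residues: {sorted(unknown)}")
    return sum(RESIDUE_MASS[residue] for residue in seq) + WATER_MASS


def relative_error(sequence: str, target_mw: float) -> float:
    """Relative gap between a sequence's weight and the target weight."""
    return abs(molecular_weight(sequence) - target_mw) / target_mw


if __name__ == "__main__":
    target = 1000.0  # hypothetical target in daltons
    candidates = ["GAGAGAGA", "WWKKFF", "MKTAYIAKQR"]  # hypothetical sequences
    for seq in sorted(candidates, key=lambda s: relative_error(s, target)):
        print(f"{seq}: {molecular_weight(seq):.1f} Da, "
              f"relative error {relative_error(seq, target):.3f}")
```

A score of this kind could rank generated sequences after each steering step. The instability index and the structural motifs mentioned in the abstract would need their own scoring functions.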
Taxonomy
Topics: Machine Learning in Materials Science · Machine Learning in Bioinformatics · Genomics and Rare Diseases
