Automated Neuron Labelling Enables Generative Steering and Interpretability in Protein Language Models

Arjun Banerjee; David Martinez; Camille Dang; Ethan Tam

arXiv:2507.06458·cs.LG·July 10, 2025

TL;DR

This paper introduces an automated neuron-labeling framework for protein language models (PLMs). The labels support interpretability and neuron activation-guided protein generation, and they offer insight into how model structure relates to the biological properties that neurons encode.

Contribution

The authors present the first scalable method for labeling neurons in PLMs with biological descriptions and develop a neuron activation-guided steering technique for protein design.

Findings

01. Individual neurons are selectively sensitive to biochemical and structural properties.

02. The steering method can generate proteins with specific target traits.

03. Analysis across model sizes reveals scaling laws and a structured neuron space in PLMs.
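To make the first finding concrete, here is a minimal sketch of how a neuron might be matched to a property. It is not the authors' labeling pipeline: the property functions, the `PROPERTIES` table, and `label_neuron` are hypothetical names, and plain Pearson correlation stands in for whatever scoring the paper uses to produce natural language descriptions. The activations are synthetic, not taken from a real PLM.

```python
import math

# Hypothetical residue groupings, used only for this illustration.
HYDROPHOBIC = set("AVILMFWC")
CHARGED = set("DEKR")


def hydrophobic_fraction(seq):
    return sum(c in HYDROPHOBIC for c in seq) / len(seq)


def charged_fraction(seq):
    return sum(c in CHARGED for c in seq) / len(seq)


def sequence_length(seq):
    return float(len(seq))


# Candidate labels (assumed, not from the paper).
PROPERTIES = {
    "hydrophobic fraction": hydrophobic_fraction,
    "charged fraction": charged_fraction,
    "sequence length": sequence_length,
}


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def label_neuron(sequences, activations):
    """Return the property whose values correlate most strongly
    (in absolute value) with the neuron's per-sequence activations."""
    scores = {
        name: pearson([fn(s) for s in sequences], activations)
        for name, fn in PROPERTIES.items()
    }
    best = max(scores, key=lambda name: abs(scores[name]))
    return best, scores[best]


sequences = ["AVILGS", "KKDEGS", "AVGSGS", "GSGSGSGS", "AVILMFKD"]
# Synthetic neuron: an affine function of hydrophobic fraction.
activations = [2.0 * hydrophobic_fraction(s) + 1.0 for s in sequences]
label, score = label_neuron(sequences, activations)
print(label, round(score, 3))
```

In a real setting the activations would come from a forward pass of the PLM over a sequence dataset, and the candidate descriptions would be far richer than three hand-written properties.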

Abstract

Protein language models (PLMs) encode rich biological information, yet their internal neuron representations are poorly understood. We introduce the first automated framework for labeling every neuron in a PLM with biologically grounded natural language descriptions. Unlike prior approaches relying on sparse autoencoders or manual annotation, our method scales to hundreds of thousands of neurons, revealing individual neurons are selectively sensitive to diverse biochemical and structural properties. We then develop a novel neuron activation-guided steering method to generate proteins with desired traits, enabling convergence to target biochemical properties like molecular weight and instability index as well as secondary and tertiary structural motifs, including alpha helices and canonical Zinc Fingers. We finally show that analysis of labeled neurons in different model sizes reveals…
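The idea of steering generation toward a target neuron's activation can be illustrated with a toy example. The sketch below is an assumption-laden stand-in, not the paper's algorithm: it uses greedy single-residue mutation (hill climbing), and `toy_neuron_activation` is a hypothetical scoring function that replaces a real PLM neuron. The `HELIX_FORMERS` set is likewise an illustrative choice.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Illustrative set of residues often described as helix-favoring (assumption).
HELIX_FORMERS = set("AELMQKR")


def toy_neuron_activation(seq):
    """Hypothetical stand-in for a labeled PLM neuron:
    the fraction of helix-favoring residues in the sequence."""
    return sum(c in HELIX_FORMERS for c in seq) / len(seq)


def steer(seq, activation_fn, steps=200, seed=0):
    """Greedy hill climbing: propose one random point mutation per step
    and keep it only if the activation strictly increases."""
    rng = random.Random(seed)
    best, best_score = seq, activation_fn(seq)
    for _ in range(steps):
        pos = rng.randrange(len(best))
        candidate = best[:pos] + rng.choice(AMINO_ACIDS) + best[pos + 1:]
        score = activation_fn(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


start = "GPGPGPGPGP"
steered, final_score = steer(start, toy_neuron_activation)
print(start, toy_neuron_activation(start))
print(steered, final_score)
```

With a real model, `activation_fn` would run the sequence through the PLM and read out the chosen neuron, and the search would likely need constraints to keep sequences plausible as proteins.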

Code & Models

Repositories

arjun-banerjee/plmneuron (PyTorch, official)

Taxonomy

Topics: Machine Learning in Materials Science · Machine Learning in Bioinformatics · Genomics and Rare Diseases