BERTology Meets Biology: Interpreting Attention in Protein Language Models

Jesse Vig; Ali Madani; Lav R. Varshney; Caiming Xiong; Richard Socher; Nazneen Fatema Rajani

arXiv:2006.15222·cs.CL·March 30, 2021

TL;DR

This paper analyzes attention in protein Transformer models and shows that it reflects protein structure and function: folding structure, binding sites, and biophysical properties. The behavior is consistent across three architectures (BERT, ALBERT, XLNet) and two protein datasets.

Contribution

The paper introduces methods for interpreting attention in protein Transformers, linking attention patterns to protein structure and function, with a three-dimensional visualization and validation across architectures.

Findings

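01 Attention captures protein folding structure, connecting amino acids that are far apart in the sequence but spatially close in the three-dimensional structure.

02 Attention targets binding sites, a key functional component of proteins.

03 Attention focuses on progressively more complex biophysical properties with increasing layer depth.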

Abstract

Transformer architectures have proven to learn useful representations for protein classification and generation tasks. However, these representations present challenges in interpretability. In this work, we demonstrate a set of methods for analyzing protein Transformer models through the lens of attention. We show that attention: (1) captures the folding structure of proteins, connecting amino acids that are far apart in the underlying sequence, but spatially close in the three-dimensional structure, (2) targets binding sites, a key functional component of proteins, and (3) focuses on progressively more complex biophysical properties with increasing layer depth. We find this behavior to be consistent across three Transformer architectures (BERT, ALBERT, XLNet) and two distinct protein datasets. We also present a three-dimensional visualization of the interaction between attention and…
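One way to make finding (1) concrete is to measure how much of a head's attention lands on residue pairs that are in spatial contact. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `attention_contact_proportion`, the 8 Å distance cutoff, and the minimum sequence separation of 6 are illustrative choices, and the inputs are assumed to be a single head's attention matrix plus one 3D coordinate per residue.

```python
import numpy as np


def attention_contact_proportion(attn, coords, contact_cutoff=8.0, min_separation=6):
    """Share of attention weight that falls on residue pairs in spatial contact.

    attn:   (L, L) attention weights for one head (hypothetical input).
    coords: (L, 3) one 3D coordinate per residue, in angstroms.
    contact_cutoff and min_separation are assumed defaults, not values
    taken from the text above.
    """
    attn = np.asarray(attn, dtype=float)
    coords = np.asarray(coords, dtype=float)
    n = attn.shape[0]
    if attn.shape != (n, n) or coords.shape != (n, 3):
        raise ValueError("expected attn of shape (L, L) and coords of shape (L, 3)")

    # Pairwise Euclidean distances between residues.
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)

    # Only count pairs that are far apart in the sequence, so that
    # trivially adjacent residues do not inflate the score.
    idx = np.arange(n)
    eligible = np.abs(idx[:, None] - idx[None, :]) >= min_separation

    # A pair is "in contact" if it is eligible and spatially close.
    contact = (dists < contact_cutoff) & eligible

    total = attn[eligible].sum()
    return float(attn[contact].sum() / total) if total > 0 else 0.0


if __name__ == "__main__":
    # Toy chain: residues spread along the x axis, with the last residue
    # folded back next to the first one.
    toy_coords = np.array([[10.0 * i, 0.0, 0.0] for i in range(8)])
    toy_coords[7] = [0.0, 5.0, 0.0]

    toy_attn = np.zeros((8, 8))
    toy_attn[0, 7] = 0.25  # distant in sequence, close in space
    toy_attn[0, 6] = 0.75  # distant in sequence, distant in space
    print(attention_contact_proportion(toy_attn, toy_coords))
```

A head with a high score under this kind of measure would be a candidate for the structure-tracking behavior described in the abstract. Comparing scores across heads and layers, or against a background contact rate, is left out of the sketch.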

Code & Models

Videos

BERTology Meets Biology: Interpreting Attention in Protein Language Models (Paper Explained) · youtube

BERTology meets Biology | Solving biological problems with Transformers · youtube

BERTology Meets Biology: Interpreting Attention in Protein Language Models · slideslive

Taxonomy

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · WordPiece · LAMB · ALBERT · Residual Connection · Label Smoothing · Multi-Head Attention