EvoStruct: Bridging Evolutionary and Structural Priors for Antibody CDR Design via Protein Language Model Adaptation
Mansoor Ahmed, Sujin Lee, Umar Khayaz, Murray Patterson

TL;DR
EvoStruct couples a frozen protein language model with 3D structural context through a cross-attention adapter for antibody CDR design, improving amino acid recovery over GNN baselines and increasing the diversity of predicted residues.
Contribution
EvoStruct bridges a frozen protein language model with 3D structural data using cross-attention, addressing vocabulary collapse in antibody CDR design.
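To make the cross-attention idea concrete, here is a minimal single-head sketch in plain Python. It is not the authors' implementation: it assumes the PLM's per-residue states act as queries and the GNN's per-residue states act as keys and values, and it omits the learned projections, multiple heads, and layer normalization a real adapter would have. The names `cross_attention`, `plm_states`, and `struct_states` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(plm_states, struct_states):
    """Single-head cross-attention with a residual connection.

    plm_states:    list of query vectors (assumed: frozen PLM embeddings)
    struct_states: list of key/value vectors (assumed: GNN structural features)
    Both must share the same vector dimension; no learned weights are used.
    """
    dim = len(plm_states[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for query in plm_states:
        # Scaled dot-product score of this query against every structural node.
        scores = [
            scale * sum(q * k for q, k in zip(query, key))
            for key in struct_states
        ]
        weights = softmax(scores)
        # Attention-weighted average of the structural vectors.
        context = [
            sum(w * value[j] for w, value in zip(weights, struct_states))
            for j in range(dim)
        ]
        # Residual add keeps the PLM signal and injects structural context.
        fused.append([q + c for q, c in zip(query, context)])
    return fused


plm = [[1.0, 0.0]]
struct = [[0.5, 0.5], [0.5, 0.5]]
fused = cross_attention(plm, struct)
```

The residual form means the PLM representation passes through unchanged when the structural context is zero, which is one common reason adapters are built this way; whether EvoStruct does the same is not stated in the summary.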
Findings
Achieves 16% higher amino acid recovery than GNN baselines.
Reduces perplexity by 43% compared to prior methods.
Recovers 2.3 times greater amino acid diversity.
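For readers unfamiliar with these metrics, the sketch below shows how amino acid recovery and perplexity are commonly computed on toy data. The summary does not say how the paper measures diversity, so `distinct_residue_types` is only an assumed stand-in (a count of distinct predicted residue types), not the paper's metric. All function names and inputs are hypothetical.

```python
import math


def amino_acid_recovery(predicted, native):
    """Fraction of positions where the predicted residue matches the native one."""
    if len(predicted) != len(native):
        raise ValueError("sequences must have equal length")
    matches = sum(p == n for p, n in zip(predicted, native))
    return matches / len(native)


def perplexity(native_probs):
    """exp of the mean negative log-likelihood assigned to the native residues.

    native_probs: the model's probability for the true residue at each position.
    """
    mean_nll = -sum(math.log(p) for p in native_probs) / len(native_probs)
    return math.exp(mean_nll)


def distinct_residue_types(predicted):
    """Assumed diversity proxy: number of distinct amino acid types predicted."""
    return len(set(predicted))


recovery = amino_acid_recovery("ARYG", "ARYD")   # 3 of 4 positions match
ppl = perplexity([0.25, 0.25, 0.25, 0.25])       # uniform over 4 choices
collapsed = distinct_residue_types("YYGY")       # only Y and G are used
```

A model that predicts mostly tyrosine and glycine can still score well on recovery while using very few residue types, which is why recovery alone does not reveal vocabulary collapse.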
Abstract
Equivariant graph neural network (GNN) methods for antibody complementarity-determining region (CDR) design achieve the highest sequence recovery but suffer from severe vocabulary collapse. The current best GNN methods over-predict very few amino acids, such as tyrosine and glycine, while ignoring functionally important residues. We trace this failure to GNN encoders learning amino acid distributions de novo from limited structural data, discarding substitution patterns encoded in evolutionary databases. To resolve this, we propose EvoStruct, which bridges a frozen protein language model (PLM) with 3D structural context from an E(3)-equivariant GNN via a cross-attention adapter. Unlike prior PLM-structure adapters for general protein design, EvoStruct targets the vocabulary collapse problem specific to CDR design through progressive PLM unfreezing and R-Drop consistency regularization.…
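R-Drop consistency regularization, as generally described, runs the same input through the model twice with different dropout masks and penalizes disagreement between the two output distributions. The sketch below computes that loss for a single position under this general formulation. The weight `alpha` and the toy distributions are assumptions; the summary does not give the paper's settings.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of floats."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def r_drop_loss(p1, p2, target, alpha=1.0):
    """R-Drop style loss for one position.

    p1, p2: output distributions from two dropout-perturbed forward passes
    target: index of the true amino acid
    alpha:  weight on the consistency term (assumed value, not from the paper)
    """
    # Negative log-likelihood of the true residue under both passes.
    nll = -math.log(p1[target]) - math.log(p2[target])
    # Symmetric KL term that pulls the two distributions together.
    consistency = 0.5 * (kl_divergence(p1, p2) + kl_divergence(p2, p1))
    return nll + alpha * consistency


same = r_drop_loss([0.5, 0.5], [0.5, 0.5], target=0)
different = r_drop_loss([0.9, 0.1], [0.6, 0.4], target=0)
baseline_nll = -math.log(0.9) - math.log(0.6)
```

When both passes agree, the consistency term is zero and the loss reduces to the summed negative log-likelihood; any disagreement adds a positive penalty.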
