Biologically-Grounded Multi-Encoder Architectures as Developability Oracles for Antibody Design
Simon J. Crouzet

TL;DR
This paper introduces CrossAbSense, a framework of property-specific neural oracles that pair frozen protein language model encoders with configurable attention decoders to predict antibody developability, with the aim of reducing the experimental screening cost of therapeutic antibody design.
Contribution
It systematically identifies optimal neural architectures for property prediction, revealing that different properties require distinct attention mechanisms, and demonstrates practical utility in reducing screening costs.
Findings
Self-attention suffices for aggregation-related properties.
Bidirectional cross-attention is needed for expression yield and stability.
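CrossAbSense achieves 12-20% improvements over established baselines on three of the five developability assays in the GDPa1 benchmark, and competitive performance on the other two.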
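To make the distinction between the two decoder styles in the findings concrete, here is a minimal NumPy sketch. It is not the paper's implementation. It assumes the encoder yields one embedding matrix per chain (heavy and light), that the self-attention variant runs over the concatenated chains, and that attention is single-head with no learned projections. The function names, shapes, and random inputs are all hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attend(queries, keys_values):
    # Single-head scaled dot-product attention.
    # Assumption: no learned query/key/value projections, to keep the sketch short.
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ keys_values


def self_attention_pool(heavy, light):
    # Self-attention: every residue attends over the concatenated chains.
    tokens = np.concatenate([heavy, light], axis=0)
    return attend(tokens, tokens).mean(axis=0)  # shape (d,)


def bidirectional_cross_attention_pool(heavy, light):
    # Bidirectional cross-attention: heavy queries light, and light queries heavy.
    heavy_ctx = attend(heavy, light)
    light_ctx = attend(light, heavy)
    # Mean-pool each direction and concatenate into one feature vector.
    return np.concatenate([heavy_ctx.mean(axis=0), light_ctx.mean(axis=0)])  # shape (2d,)


# Stand-ins for frozen language-model embeddings (hypothetical lengths and width).
rng = np.random.default_rng(0)
heavy = rng.normal(size=(120, 16))
light = rng.normal(size=(108, 16))

self_feat = self_attention_pool(heavy, light)
cross_feat = bidirectional_cross_attention_pool(heavy, light)
print(self_feat.shape, cross_feat.shape)
```

In a full oracle, either pooled vector would feed a small regression head trained per property while the encoder stays frozen. The only structural difference shown here is which tokens serve as keys and values for each query.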
Abstract
Generative models can now propose thousands of de novo antibody sequences, yet translating these designs into viable therapeutics remains constrained by the cost of biophysical characterization. Here we present CrossAbSense, a framework of property-specific neural oracles that combine frozen protein language model encoders with configurable attention decoders, identified through a systematic hyperparameter campaign totaling over 200 runs per property. On the GDPa1 benchmark of 242 therapeutic IgGs, our oracles achieve notable improvements of 12-20% over established baselines on three of five developability assays and competitive performance on the remaining two. The central finding is that optimal decoder architectures invert our initial biological hypotheses: self-attention alone suffices for aggregation-related properties (hydrophobic interaction chromatography,…
