Antibody Foundational Model: Ab-RoBERTa
Eunna Huh, Hyeonsu Lee, Hyunjin Shin

TL;DR
This paper introduces Ab-RoBERTa, an antibody-specific language model built on the RoBERTa architecture. It leverages large-scale antibody sequence datasets and is intended to make antibody research more efficient and more accessible.
Contribution
The study develops and releases Ab-RoBERTa, the first publicly available antibody-specific RoBERTa-based language model, tailored for antibody sequence analysis tasks.
Findings
Ab-RoBERTa demonstrates effective performance in antibody-related tasks.
The underlying RoBERTa architecture has a smaller parameter count (125M) than the BERT-based protein model ProtBERT (420M), which enables more efficient deployment.
Public availability facilitates broader antibody research applications.
Abstract
With the growing prominence of antibody-based therapeutics, antibody engineering has gained increasing attention as a critical area of research and development. Recent progress in transformer-based protein large language models (LLMs) has demonstrated promising applications in protein sequence design and structural prediction. Moreover, the availability of large-scale antibody datasets such as the Observed Antibody Space (OAS) database has opened new avenues for the development of LLMs specialized for processing antibody sequences. Among these, RoBERTa has demonstrated improved performance relative to BERT, while maintaining a smaller parameter count (125M) compared to the BERT-based protein model, ProtBERT (420M). This reduced model size enables more efficient deployment in antibody-related applications. However, despite the numerous advantages of the RoBERTa architecture,…
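The parameter counts quoted in the abstract (125M for RoBERTa, 420M for ProtBERT) can be turned into a rough storage estimate to make the deployment argument concrete. The sketch below is a back-of-the-envelope calculation, not a figure from the paper: it assumes dense weights only (no activations, optimizer state, or framework overhead), and the bytes-per-parameter values and the helper name `weight_memory_mb` are illustrative assumptions.

```python
# Rough weight-storage estimate for the parameter counts quoted in the abstract.
# Assumptions (not from the paper): dense weights only, 4 bytes per fp32
# parameter, 2 bytes per fp16 parameter, 1 MB = 10**6 bytes.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}

# Parameter counts as stated in the abstract.
PARAM_COUNTS = {
    "RoBERTa": 125_000_000,
    "ProtBERT": 420_000_000,
}


def weight_memory_mb(num_params: int, dtype: str = "fp32") -> float:
    """Approximate size of the weights alone, in megabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1_000_000


def size_ratio(small: str, large: str) -> float:
    """Fraction of the larger model's parameters used by the smaller one."""
    return PARAM_COUNTS[small] / PARAM_COUNTS[large]


for name, count in PARAM_COUNTS.items():
    fp32 = weight_memory_mb(count, "fp32")
    fp16 = weight_memory_mb(count, "fp16")
    print(f"{name}: ~{fp32:.0f} MB (fp32), ~{fp16:.0f} MB (fp16)")

print(f"RoBERTa / ProtBERT parameter ratio: {size_ratio('RoBERTa', 'ProtBERT'):.2f}")
```

Under these assumptions the RoBERTa-sized model needs roughly 30% of the weight storage of ProtBERT. Actual memory use at inference time also depends on sequence length, batch size, and runtime, so this is only a lower bound on the footprint.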
Taxonomy
Topics: Monoclonal and Polyclonal Antibodies Research
